
What do you use to find duplicate files, images, porn, etc.


Thread replies: 87
Thread images: 10

File: 0408-thumb-100031240-large[1].jpg (36KB, 580x292px) Image search: [Google]
What do you use to find duplicate files, images, porn, etc. on your drives?
>>
>>57898015
Brain and eyes 'n shit like that.
>>
>>57898015
fslint
>>
look for matching checksums. then find similarly named files and inspect them manually.
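
A rough sketch of the "similarly named files" half, in Python. It groups files whose names only differ by case or by the copy markers browsers tack on (like the `[1]` in OP's filename). The suffix list and the function names are made up for illustration, and the groups are only candidates for manual inspection, not proof of anything.

```python
import os
import re
from collections import defaultdict

# Copy markers commonly appended to re-downloaded or copied files.
# This list is an assumption; extend it to match your own collection.
_COPY_SUFFIX = re.compile(
    r"(\s*\(\d+\)|\[\d+\]|\s*-\s*copy(\s*\(\d+\))?|_copy)+$",
    re.IGNORECASE,
)


def normalize_name(filename):
    """Lowercase the name and strip trailing markers such as '[1]' or ' (2)'."""
    stem, ext = os.path.splitext(filename)
    stem = _COPY_SUFFIX.sub("", stem.strip())
    return stem.lower() + ext.lower()


def group_similar_names(root):
    """Return lists of paths under root whose normalized names collide."""
    groups = defaultdict(list)
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            groups[normalize_name(name)].append(os.path.join(dirpath, name))
    return [sorted(paths) for paths in groups.values() if len(paths) > 1]
```

Calling `group_similar_names("/path/to/folder")` gives you a list of groups to eyeball; anything it reports still needs a checksum or a look before you delete.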
>>
>>57898015

https://www.hardcoded.net/dupeguru/

Dupeguru
>>
>>57898079
/thread
>>
>>57898079
>muh windows hate
lolz
>>
>>57898418
developing for windows is a pain in the ass. it's not hate just because they don't accommodate babies.
>>
>The last time I used Windows "for real" was more than 10 years ago, in 2005.
>I hate Windows with passion now. It seems to get everything backwards.

this is 4chan levels of 'i've never ever had it, used it, seen it or experienced it, but I HATE IT' lmao
>>
>>57898030
it's difficult to manage 2.5TB of porn and filter out duplicates when you have 10k+ images and 1k+ videos
>>
>>57898015
fslint
>>
I use my brain.
>>
Awesome Duplicate Photo Finder for pictures
>>
>>57898015
Dup Detector
>>
>>57898418
>>57899237
Developing for Windows is *the fucking worst*. Everything is so ass-backwards, fucked up, and overcomplicated.

Microsoft makes very few considerations for developers that aren't paying big money to suck their dick. This is Microsoft's entire business model. It is irrefutable. It was bad in 2005 and it's worse now. Go ahead and try to get started writing a quick program in C++ on Linux. Do the same on Windows. If you genuinely think it's easier to start on Windows, please go back to /v/; you're probably a fucking idiot.
>>
CCleaner has a Duplicate Finder under "Tools"
>>
>>57898015
use btrfs and don't care about shit
>>
>>57899269
ignore retards. this board is full of people whose use of technology pretty much ends at "collecting 20 reaction images"
>>
>>57899430
This.
>download 1TB of visual studio
>>
>>57899457
Explain. I know it's a filesystem.
>>
>>57898015
md5sum 2bh

Everything else is botnet
>>
>>57899707
BRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR UFFFFF
>>
>>57899768
*sniff*
>>
VisiPics. A hash checker is one thing but this program is actually pretty good at finding images that look similar as well.
>>
>>57898430
>object-oriented API
>best IDEs (Visual Studio, C++ Builder, etc.)
>most comprehensive documentation
>great support
How utterly incompetent do you have to be to fail at Windows development?
>>
File: honkerito.png (60KB, 184x184px) Image search: [Google]
>he doesn't have a filesystem that supports deduplication
>>
'Awesome Duplicate Photo Finder' is pretty good for finding matching images with different resolutions, haven't found any other tools that work as well as it
>>
>>57903277
Is it open source?
>>
>>57903327
Sadly I don't think so, and last time I checked I couldn't find any good OS tools like it

Wish it was OS though since there's a few things I'd like to add, eg letting you compare the image by hovering over it
>>
>>57903277
Is there something like this that works with webms?
>>
>>57898015
fdupes
>>
>>57903396
Haven't seen anything like it, you could probably hack something up to make contact sheets (pic related) from your webms and compare those tho
>>
>>57903429
Sauce
>>
>>57903038
I assume you're either using btrfs, in which case lmao, or zfs+dedup, in which case lmao
>>
>>57903455
mless.com/1075BE3
>>
I just leave dupes all over the place because I'm fucking terrible at categorizing stuff and will never find it where it's supposed to be
>>
>>57898015
rmlint
>>
>>57898015
find . -type f -exec md5sum {} + | sort | uniq -w32 -D

find files and execute md5sum, sort by the md5s, and then find duplicate md5s and print them. this takes care of strictly identical files

for visually similar images i use an image-based fingerprinting algorithm in place of md5, and i test for "close" fingerprints instead of perfect matches
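
The post doesn't say which fingerprinting algorithm it uses, so here is a minimal sketch of one common choice, an "average hash", compared by Hamming distance. It is a stand-in, not the poster's method. It assumes the image has already been decoded, converted to grayscale and shrunk to a small grid (real tools usually use something like 8x8), and the `max_distance=5` threshold is an arbitrary assumption.

```python
def average_hash(pixels):
    """Fingerprint a small grayscale image given as rows of 0-255 ints.

    Each bit records whether a pixel is brighter than the image mean.
    Decoding and downscaling the image is assumed to have happened already.
    """
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for p in flat:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming(a, b):
    """Number of bits that differ between two fingerprints."""
    return bin(a ^ b).count("1")


def is_close(a, b, max_distance=5):
    """Treat fingerprints within max_distance bits as 'visually similar'.

    The default threshold is an assumption; tune it on your own images.
    """
    return hamming(a, b) <= max_distance
```

A uniformly brighter copy of an image keeps the same bit pattern, which is why this survives recompression and small edits where md5 does not.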
>>
>>57903648

That would take forever.

Either:

1- parallelize it to compute many md5sums at the same time
2- only compare md5sums for files of same size (so sort by size first)
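
Both suggestions can be sketched together in Python: bucket by size first, then hash only the buckets with more than one file, using a thread pool. Function names and the `workers=4` default are made up for illustration. Threads are used on the assumption that hashing large buffers releases the GIL, which CPython's `hashlib` does; whether it is actually faster depends on whether your disk or your CPU is the bottleneck.

```python
import hashlib
import os
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def file_digest(path, chunk_size=1 << 20):
    """MD5 a file in chunks so large videos never sit fully in memory."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root, workers=4):
    """Return lists of paths that share both size and MD5 digest."""
    by_size = defaultdict(list)
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path):
                by_size[os.path.getsize(path)].append(path)

    # Only non-empty files that share a size with another file need hashing.
    candidates = [
        (size, path)
        for size, paths in by_size.items()
        if size > 0 and len(paths) > 1
        for path in paths
    ]

    groups = defaultdict(list)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        digests = pool.map(file_digest, [path for _size, path in candidates])
        for (size, path), digest in zip(candidates, digests):
            groups[(size, digest)].append(path)
    return [sorted(paths) for paths in groups.values() if len(paths) > 1]
```

On a collection where most files have a unique size, the size pass alone removes most of the hashing work before any parallelism comes into play.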
>>
>>57903721
3- use an existing tool that already does all of this better
>>
>>57898015
common sense
>>
>>57899269
>>57899269
>>57899269
http://www.video-comparer.com/

only good solution for video that I have seen, I paid for it.
>>
File: Screenshot_20161208-183241~01.png (593KB, 1440x2001px) Image search: [Google]
Well there's btrfs which is great

I like meld.. it's OK
>>
>>57898015
duff
>>
>>57906295
>Well there's btrfs which is great
top meme
>>
>>57906428
What zfs
>>
Try vistanita duplicate finder.
It's very easy to use and it's full of features.
It also has different modes for finding bit-by-bit duplicates, pictures of different res or quality, duplicated music, etc.
You can also set many different parameters to fine-tune your search.
I use it almost every day, I love it.
If anyone wants, I can upload my cracked copy in an hour or so, when I'll be on the pc.
>>
>>57907066
What english
>>
File: Untitled.png (229KB, 2179x1303px) Image search: [Google]
>>57898015
I know my porn very well. I don't download it twice :^) Only solo/lesbian, btw
>>
>>57898015
check md5
>>
>>57898015
Find locate sort uniq
>>
>>57898015
find . -not -empty -type f -printf "%s\n" | sort -rn | uniq -d | xargs -I{} find . -type f -size {}c -print0 | xargs -0 md5sum | sort | uniq -w32 --all-repeated=separate

fdupes works pretty good for 4chan downloaded jpgs. (filename and md5sum)
>>
>>57907601
>MD5
http://www.mscs.dal.ca/~selinger/md5collision/
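
If the collision point matters to you, one hedge is to treat a hash match as a candidate only and confirm it byte for byte before deleting anything. A minimal sketch with Python's standard `filecmp` module; `confirm_identical` is a made-up helper name, and it expects a list of paths that some earlier hashing step already flagged as matching.

```python
import filecmp


def confirm_identical(paths):
    """Split a group of same-hash paths into sets that match byte for byte."""
    confirmed = []
    for path in paths:
        for group in confirmed:
            # shallow=False forces a content comparison instead of
            # trusting size and modification time from os.stat().
            if filecmp.cmp(group[0], path, shallow=False):
                group.append(path)
                break
        else:
            confirmed.append([path])
    return [group for group in confirmed if len(group) > 1]
```

The extra read only happens for files that already matched on hash, so the cost is small compared with the hashing pass itself.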
>>
>>57898015
fdupes -rdS /path/to/folder
Finds duplicate files, shows you the filenames + size, and asks which to keep. I use it to clean my reaction images folder every once in a while.
>>
Anyone know tools which can look for duplicates inside zip files too? I tend to archive stuff and it'd be a pain extracting everything just to check for duplicates inside.
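
Not a ready-made tool, but a sketch of how this could be done with Python's standard `zipfile` module: each member is decompressed as a stream and hashed, so nothing gets extracted to disk. The function names and the `archive!member` display format are made up for illustration, and it only handles plain zip files (no rar, no nested or encrypted archives).

```python
import hashlib
import zipfile
from collections import defaultdict


def zip_member_digests(zip_path, chunk_size=1 << 20):
    """Yield (member_name, size, sha256) for each file inside the archive."""
    with zipfile.ZipFile(zip_path) as zf:
        for info in zf.infolist():
            if info.is_dir():
                continue
            h = hashlib.sha256()
            # zf.open() returns a file-like object that decompresses on read.
            with zf.open(info) as member:
                for chunk in iter(lambda: member.read(chunk_size), b""):
                    h.update(chunk)
            yield info.filename, info.file_size, h.hexdigest()


def duplicates_across_zips(zip_paths):
    """Return groups of 'archive!member' labels with identical content."""
    groups = defaultdict(list)
    for zip_path in zip_paths:
        for name, size, digest in zip_member_digests(zip_path):
            # The '!' separator is only a display convention.
            groups[(size, digest)].append(f"{zip_path}!{name}")
    return [sorted(labels) for labels in groups.values() if len(labels) > 1]
```

It still has to decompress every member once, so it is not free, but it avoids filling the disk with temporary extractions.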
>>
Nothing, I don't give a shit.
>>
>>57907726
Looking for dupe reaction faces, not verifying nuke missile launch codes.
>>
>>57907356
You gonna share with the class here anon?
>>
File: IMG_0262.jpg (15KB, 300x168px) Image search: [Google]
>>57898030
Best post.
>>
>>57899430
>>57898079

literally the developer sperging

i'd make a joke about how you're probably desperate for donations but you're so autistic you don't even accept them LOL!
>>
>>57901866
>windows
>most comprehensive docs
>>
>>57907184
Before the thread 404s, here it is if anyone wants it:
http://www58.zippyshare.com/v/j2xkenLO/file.html
>>
>>57910910
nice malware
>>
>>57911226
Actually yeah, it could have malware kek, as I probably downloaded it from a sketchy source.
Inside that zip there are the .rar archives as I downloaded them, so you can scan them if you want.
If there's malware, I certainly wasn't the one who put it there.
>>
>>57911252
sure that sounds very believable
>>
File: vt.png (12KB, 570x258px) Image search: [Google]
>>57911252
Upload the file to VirusTotal and check the date it was first scanned.

https://virustotal.com/en/file/15f15086de102939941f2b0e784309f434810df6afe75ab2665ed0020bffd117/analysis/

This obviously means it hasn't been touched since.
>>
Fuck, this >>57911406 was meant for >>57911370
>>
>>57911406
>detection ratio: 1/52
lol

Also,
1. virus scanners are trivial to defeat. You can take any malware and slightly modify it to bypass the check
2. just because you wrote your epik malware a few years back doesn't somehow make it stop being malware
>>
>>57911442
>Implying that's not the obvious false positive from the cracked exe
>Implying malware made (and spread around the Internet) 6+ years ago wouldn't be immediately detected now
>Implying I give a shit if you install it or not
>Implying I'd stay here putting all this effort to make a couple of tech-savvy people install my old malware, when I could easily put it somewhere else and let hundreds of tech-illiterates download it without having to defend myself.

I'm just trying to be helpful. If you don't want my version, you can find it elsewhere, or not download it at all for all I give a shit.
>>
>>57911497
Also the 3.9.5 actually seems to be risky, so don't install that.
I just had it in my folder and uploaded it without checking.
>>
File: 1.jpg (144KB, 1080x1080px) Image search: [Google]
Bumps
>>
>>57911497
>Implying that's not the obvious false positive from the cracked exe
Yes, yes, I'm sure your crack is a “false positive” ;-)

>steps to install
>0: turn off ur antivirus
>1: run exe
>2: click “ok” on all driver certificate warnings that show up
>3: IGNORE any antivirus messages, they are a FALSE POSITIVE
>tested 100% clean cracked by SKIDROW
>>
>>57911649
Wut?
Where did you read that?
It's not in either of the folders.
>>
>>57903415
>>
>>57898015
SimilarImages for images. It's so far the only one I could find that is able to compare over 50 gigs of images and find those that are similar
>>
>>57898079
Wincuccs REKT
>>
>>57901866
Do I need to point out how I know you are not a dev with that shitpost?
>>
File: 2016-11-14-014730_530x473_scrot.png (269KB, 530x473px) Image search: [Google]
I use a filesystem with native dedup since I'm not a pleb.
>what are inodes
>>
>>57912509
and what filesystem might that be, memelord?
>>
>>57898015
Mk I Eyeball
>>
http://www.nirsoft.net/utils/search_my_files.html
>>
>>57912628
zfs on my storage server running freebsd.
Wouldn't want to be running the hack linux port.
>>
>>57912739
>he has zfs dedup turned on
hahaha oh wow

enjoy your 0.X% storage gains in exchange for nuking your ARC size
>>
>>57912763
I've got 64GiB of RAM, and a 256GB SSD for cache. Why don't you?
>>
>>57912713
Nice , just tested it

IT JST WRRRKS

It is sometimes hard to trust links on /g/; I do not want to become a bitcoin miner for someone else

but this time it is safe
>>
>>57912819
S-Sorry, my L2ARC is only 128 GB because I had to sacrifice one of my SSDs for a barebones build

Still, I have 64 GiB of RAM as well and I have dedup turned off, because the gain is not worth the cost no matter how much spare RAM you have.
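
For a feel of the cost being argued about, here is a back-of-the-envelope estimate in Python. It leans on the commonly quoted rule of thumb of roughly 320 bytes of RAM per dedup-table entry, one entry per unique block; that figure and the 128 KiB default recordsize are assumptions here, not measurements, and real pools will differ.

```python
import math

# Rule-of-thumb figure often quoted for ZFS dedup: about 320 bytes of RAM
# per dedup-table entry. Treat it as an assumption, not a measured value.
BYTES_PER_DDT_ENTRY = 320


def estimate_ddt_bytes(pool_bytes, recordsize=128 * 1024,
                       bytes_per_entry=BYTES_PER_DDT_ENTRY):
    """Worst-case dedup table size if every block in the pool is unique."""
    blocks = math.ceil(pool_bytes / recordsize)
    return blocks * bytes_per_entry
```

Under those assumptions, `estimate_ddt_bytes(2.5 * 2**40)` for 2.5 TiB of data comes out at 6.25 GiB of table, and smaller records push that up fast. Every byte of it competes with the ARC for the same memory.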
>>
>>57912819
>>57912895
ps. show us your `zpool get dedupratio`
This is a 4chan archive - all of the content originated from that site.
This means that RandomArchive shows their content, archived.