
I'm currently programming an alternative software to things


Thread replies: 18
Thread images: 5

File: IMG_0031.png (66KB, 1000x1000px)
I'm currently programming an alternative to software such as WriteCheck, Turnitin and Viper for my students, and I was wondering if there are any current exploits (e.g. scripting) or anything else that could potentially bypass the system. I want to patch these in my updated programme, specifically for the submission of Word document files.

Just as a heads up I'm aware of methods that include inserting quotes, foreign letters/symbols, using images of text and merely paraphrasing (none of which work under most plagiarism checker algorithms anyway).
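
For the foreign-letter trick in particular, here is a rough sketch of how a checker might flag it. This is Python with the standard library only; the function names are made up for illustration, and it is an assumption about one possible approach, not how any of the products named above actually work. It flags any word whose letters come from more than one script, going by the first word of each character's Unicode name.

```python
import unicodedata


def scripts_in_word(word):
    """Return the set of script names (e.g. LATIN, CYRILLIC) used by the letters in word."""
    scripts = set()
    for ch in word:
        if ch.isalpha():
            name = unicodedata.name(ch, "")
            if name:
                # Unicode names start with the script, e.g. "CYRILLIC SMALL LETTER A".
                scripts.add(name.split(" ")[0])
    return scripts


def mixed_script_words(text):
    """Return the whitespace-separated words that mix letters from several scripts."""
    return [w for w in text.split() if len(scripts_in_word(w)) > 1]


# "pl\u0430in" looks like "plain" but contains a Cyrillic "a" (U+0430).
print(mixed_script_words("plain pl\u0430in text"))
```

A word written entirely in one script passes untouched, so this only catches substitution inside otherwise Latin words.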
>>
File: IMG_0029.png (4KB, 225x225px)
B-BUMP
>>
File: image.jpg (64KB, 700x504px)
>>307257
Twelve years on, and people are still trying to enumerate badness.

http://www.ranum.com/security/computer_security/editorials/dumb/
>>
>>307310
ok but that didn't answer the question
>>
>>307313
Yeah it did.

Parse the file, extract the text, check the text, and pass on to the markers only the text you've checked.

This way the markers see the exact text your software sees, and if you cleverly hide your essay from the plagiarism check, then you also cleverly hide it from the guy who's marking it.
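
A minimal sketch of that pipeline, assuming a .docx submission and using only the Python standard library. The function names `extract_paragraphs` and `checked_text` are hypothetical, and this is an illustration rather than a hardened parser: it reads `word/document.xml` only, ignores headers, footnotes and text boxes, and is not safe against deliberately malformed files.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML main namespace, used by the elements in word/document.xml.
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def extract_paragraphs(docx_bytes):
    """Return the plain text of each paragraph in the main document part."""
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as zf:
        root = ET.fromstring(zf.read("word/document.xml"))
    paragraphs = []
    for p in root.iter(W + "p"):
        # A paragraph's text is the concatenation of its w:t elements.
        paragraphs.append("".join(t.text or "" for t in p.iter(W + "t")))
    return paragraphs


def checked_text(docx_bytes):
    """The single string that goes to BOTH the plagiarism check and the marker."""
    return "\n".join(extract_paragraphs(docx_bytes))
```

The design point is that `checked_text` is the only thing anyone downstream ever sees; the original file is never shown to the marker.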
>>
>>307314
Could this be done on a Word 2016 document?
>>
>>307314
Also, wouldn't this have the effect of changing the outward appearance of the text? I tried doing something similar with the document.xml file of a Word document in Notepad++ and it didn't work.
>>
>>307320
Of course it would. That is the whole point.

Look, I don't feel good shitting on your idea, but based on your past few threads you seem to be in woefully over your head. Parsing out raw text is supposed to be the easy bit; how the hell are you going to make a natural-language parser that can spot plagiarism against a big-data collection of essays (and where are you even going to get your dataset from) if this is the bit you're stuck on?
>>
>>307315
Load it into Word, select all the text, then open a new document and paste it as unstyled text.

This can all be done procedurally using VBA.

The advantages of using Word instead of writing your own parser are:

- guaranteed to parse the same way Word does
- don't have to write a .docx parser, because you just bought one in
- regular security updates from one of the largest companies in the world

It's the last one that's really important. You can't write a parser that deals with untrusted input. Trust me, you can't. Even the big boys struggle, and it's a regular thing to see malformed input vulnerabilities in Office, Chrome, Firefox, etc. Microsoft, Google, etc. have spent multiple man-years just on their input validation. You on your own have no chance whatsoever.
>>
Is there a way to keep the original outward appearance of text in a Word document but have software 'read' it as being something completely different?
>>
File: IMG_0032.png (698KB, 600x640px)
>>307326
Thanks. I acknowledge that I have a huge project ahead of me, but I think I have a slight (?) advantage of being on the lookout for exploits because I'm operating on a smaller scale. The dataset will be from a few small classes.
>>
>>307330
Of course there is.

This is why you pass to the invigilators the text your software "read", and only the text your software "read".
>>
File: IMG_0033.jpg (50KB, 316x474px)
>>307333
Yes, of course. But how would a student go about doing that (changing the internal strings read by the software)? What kind of methods would they use to achieve this? I want to try to be one step ahead and make sure they know I'm aware of how they do that.
>>
>>307331
>I have a slight (?) advantage of being on the lookout for exploits because I'm operating on a smaller scale. The dataset will be from a few small classes.
I think you're missing the point: if one of your submissions is off a cheating website, how will you spot that with a dataset that doesn't include anything from that cheating website?

If all you're comparing your submissions to is the work of a few small classes, there's pretty much no point in what you're doing.
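
To make that concrete, here is a hedged sketch of a class-only comparison using word 3-gram ("shingle") overlap with Jaccard similarity. That technique is my assumption for illustration; nothing in this thread says which method the real checker would use, and the names `shingles` and `jaccard` are hypothetical.

```python
import re


def shingles(text, n=3):
    """Return the set of overlapping word n-grams in text (lowercased)."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def jaccard(a, b, n=3):
    """Jaccard similarity of the two texts' shingle sets, from 0.0 to 1.0."""
    sa, sb = shingles(a, n), shingles(b, n)
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


classmate_a = "the cat sat on the mat"
classmate_b = "the cat sat on the rug"
from_outside = "an essay copied from a source nobody in the class has seen"

print(jaccard(classmate_a, classmate_b))   # high: copied from each other
print(jaccard(classmate_a, from_outside))  # zero: nothing in the set to match
```

A score can only be high when the source is in the comparison set, so an essay lifted from anywhere outside the class scores zero against every classmate.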
>>
>>307334
See >>307310.

Your approach is fundamentally flawed, because there are so many ways to do steganography in a format as complicated as .docx that it's not even possible to count them all.
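
To show how narrow any single check is, here is a sketch that covers exactly one channel: runs marked as hidden text with `w:vanish` directly in their run properties. The function names are hypothetical and the scope is an assumption made for brevity. It does not look at styles, white-on-white text, tiny fonts, field codes, or anything outside `word/document.xml`, which is rather the point.

```python
import xml.etree.ElementTree as ET

W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def run_is_hidden(run):
    """True if the run carries a direct w:vanish property that is switched on."""
    vanish = run.find(f"{W}rPr/{W}vanish")
    if vanish is None:
        return False
    # An on/off property with no w:val attribute counts as on.
    return vanish.get(W + "val", "true") not in ("false", "0", "off")


def split_visible_hidden(document_xml):
    """Return (visible_text, hidden_text) for the runs in a document.xml string."""
    root = ET.fromstring(document_xml)
    visible, hidden = [], []
    for run in root.iter(W + "r"):
        text = "".join(t.text or "" for t in run.iter(W + "t"))
        (hidden if run_is_hidden(run) else visible).append(text)
    return "".join(visible), "".join(hidden)
```

Every further trick needs another special case like this one, and you will never know when the list is complete.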
>>
>>307335
The documents will be written by students who do not have internet access at the time of writing, but they will be typing in the same room together unsupervised. A really specific scenario, I know, but it's what I've been requested to make.
>>
>>307336
Very true. I've looked into steganography, but there are so many conflicting accounts of how to do it in Word docs. What's the easiest method?
>>
>>307338
So why not just make them type in notepad?

I'm not seeing the threat model here: a number of students so small that a single marker can and will spot plagiarism by hand is going to copy stuff from somewhere that isn't the Internet, and then is going to (whilst being supervised by an invigilator) hide the plagiarised essay they got from nowhere under a steganographic layer containing a second essay they somehow have time to write in order to confuse an NIH plagiarism checker that has nothing to do in the first place?

This has all the hallmarks of an in-house vanity project, and you owe it to whoever's paying your budget to kill it off now while you've not wasted too much money on it.
This is a 4chan archive - all of the content originated from that site.
This means that RandomArchive shows their content, archived.