Description
CourseVillain is an Embry-Riddle Aeronautical University project started in 2018 by a group of professors and student programmers in response to copyrighted university content appearing on the Course Hero website. On the CourseVillain website, professors, course designers, and student assistants can create an account and then set up a "scanner" for any course they are in charge of. Once per day, CourseVillain scrapes the Course Hero website for new content related to that course and reports it to the user by email. If a document is copyrighted and belongs to the user, they can have the program automatically file a DMCA takedown notice asking Course Hero to remove the content.
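The daily "report only what is new" step can be sketched as a simple comparison against URLs that have already been reported. This is a minimal illustration, not the project's actual code: the `ScrapedDocument` shape, the `findNewDocuments` function, and the use of the document URL as the unique key are all assumptions made for the example.

```typescript
// Hypothetical shape of one scraped search result (not from the CourseVillain codebase).
interface ScrapedDocument {
  url: string;
  title: string;
}

// Returns only the documents whose URL has not been reported before,
// and records them in `seenUrls` so the next daily run skips them.
function findNewDocuments(
  scraped: ScrapedDocument[],
  seenUrls: Set<string>
): ScrapedDocument[] {
  const fresh: ScrapedDocument[] = [];
  for (const doc of scraped) {
    if (!seenUrls.has(doc.url)) {
      seenUrls.add(doc.url); // also de-duplicates repeats within one scrape
      fresh.push(doc);
    }
  }
  return fresh;
}
```

In a real deployment the set of seen URLs would presumably be persisted between runs (for example in the database) rather than held in memory.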
My role in this project was to continue the work of the previous programmers. I completed features they had left unfinished, updated and refined the scraping and form automation processes, and wrote user guides for both regular users and future developers. I also co-authored a research paper about the project, titled "Trying to Catch Lightning in a Bottle: Surveying Plagiarism Futures".
One major problem I addressed was that, after CourseVillain was first developed, Course Hero added a Google reCAPTCHA to its copyright infringement form. Course Hero also implemented a number of anti-bot measures to try to prevent us from accessing its website. Because a person now had to complete the captcha, the form could no longer be filled out automatically in the background on a server. I solved this by replacing the Puppeteer-based form automation with takedown notices sent to Course Hero by email, following the notice procedure laid out in the DMCA. This was much simpler and less resource-intensive than the previous approach, and it was less prone to error.
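To illustrate why the email approach is lighter than driving a browser, here is a minimal sketch of composing a takedown notice as plain text. It is not the project's actual implementation: `TakedownRequest`, `EmailMessage`, `buildTakedownEmail`, and the recipient address are hypothetical names chosen for the example, and the wording of the notice is illustrative rather than legal advice. The statements included follow the elements a notice must contain under 17 U.S.C. 512(c)(3).

```typescript
// All names here are hypothetical; none are taken from the CourseVillain codebase.
interface TakedownRequest {
  ownerName: string;        // copyright owner or authorized agent
  ownerEmail: string;       // contact address for the complaining party
  workDescription: string;  // identifies the copyrighted work
  infringingUrls: string[]; // where the infringing copies were found
}

interface EmailMessage {
  to: string;
  replyTo: string;
  subject: string;
  body: string;
}

// Builds the notice text only; sending it is left to whatever mail library is in use.
function buildTakedownEmail(req: TakedownRequest, agentAddress: string): EmailMessage {
  if (req.infringingUrls.length === 0) {
    throw new Error("At least one infringing URL is required");
  }
  const urlList = req.infringingUrls.map((u, i) => `${i + 1}. ${u}`).join("\n");
  const body = [
    "To the designated copyright agent:",
    "",
    `Copyrighted work: ${req.workDescription}`,
    "",
    "Infringing material is located at:",
    urlList,
    "",
    "I have a good faith belief that the use described above is not authorized",
    "by the copyright owner, its agent, or the law.",
    "",
    "The information in this notice is accurate, and under penalty of perjury,",
    "I am authorized to act on behalf of the copyright owner.",
    "",
    `Signed: ${req.ownerName}`,
    `Contact: ${req.ownerEmail}`,
  ].join("\n");
  return {
    to: agentAddress,
    replyTo: req.ownerEmail,
    subject: `DMCA takedown notice: ${req.workDescription}`,
    body,
  };
}
```

Because the output is just a string, it can be generated on a server with no headless browser and no captcha to solve, which matches the reduction in resource use described above.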
Last Updated
Feb 2022
Released
Feb 2020
Technologies
- Node.js
- MongoDB
- Puppeteer
- Express
- Passport
- Angular