6

It would be trained to recognize despite distortions, as a human can.

6 comments

PMYA 1 points (+1|-0)

Fair use doesn't have to be length-based. I could put Scarface on YouTube in its entirety and fart all the way through it, and argue that it comes under fair use because I am critiquing it or making a parody. It doesn't have to be good or even make sense; if I can argue that it should be classed as fair use, nobody can remove it. I'm not sure about educational fair use though, I don't know anything about that.

Even if you had something capable of flagging things with a decent amount of accuracy, what would it use as a database to compare YouTube videos against when deciding whether something should be removed? It would need to perform a reverse video search of sorts.

phoxy [OP] 1 points (+1|-0)

You make good points. Fair use might torpedo the idea for the foreseeable future.

Categorization by neural network would work like this: the network would "watch" the video and spit out a high-dimensional vector (1000 dimensions, for example). Similar content would have similar vectors, like content-based hashing but with each dimension of the vector having a semantic meaning. The network might learn to recognize comedy or length or actors as a dimension.

If the copyright holder submits the work as a reference, similar videos can be identified because they would have very similar vectors. The vectors of copied videos would cluster together in the vector space. An upload of Scarface dubbed with farts would be close to the original in many vector dimensions, except for the dimensions which encode audio information. Adjusting the distance threshold around the reference vector lets you fine-tune which videos get flagged, much as the current system works, where subtly distorting an upload can keep it from being flagged.
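
To make the threshold idea concrete, here is a rough sketch in Python. Everything in it is made up for illustration: the 4-dimensional vectors stand in for the 1000-dimensional output of a network, the split into "visual" and "audio" dimensions is an assumption, and `euclidean_distance` and `flag_uploads` are hypothetical helper names, not part of any real system.

```python
import math


def euclidean_distance(a, b):
    """Straight-line distance between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def flag_uploads(reference, uploads, threshold):
    """Return the names of uploads whose vector is within `threshold` of the reference."""
    return [
        name
        for name, vector in uploads.items()
        if euclidean_distance(reference, vector) <= threshold
    ]


# Toy vectors (assumed layout): [visual_1, visual_2, audio_1, audio_2]
reference = [0.9, 0.1, 0.8, 0.2]  # the work submitted by the copyright holder
uploads = {
    "exact_copy": [0.9, 0.1, 0.8, 0.2],
    "dubbed_copy": [0.9, 0.1, 0.1, 0.9],  # same visuals, different audio
    "unrelated": [0.1, 0.8, 0.3, 0.6],
}

strict = flag_uploads(reference, uploads, threshold=0.5)  # catches only the exact copy
loose = flag_uploads(reference, uploads, threshold=1.0)   # also catches the dubbed copy

print(strict)
print(loose)
```

With the tight threshold only the untouched copy is flagged; loosening it also pulls in the dubbed version, because that upload only differs from the reference in the audio dimensions. Where to set the threshold is the tuning knob, and a real system would probably need something smarter than a single global distance.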