Facebook's 'Rosetta' System Helps the Company Understand Text Within Images, Which Is Crucial in Handling Memes, Flagging Abusive Content (techcrunch.com)
Facebook announced on Tuesday a new AI system, codenamed "Rosetta," which helps teams at the company as well as those at Instagram identify text within images to better understand what their subject is and more easily classify them for search or to flag abusive content. From a report: It's not all memes; the tool scans over a billion images and video frames daily across multiple languages in real time, according to a company blog post. Rosetta makes use of recent advances in optical character recognition (OCR) to first scan an image and detect text that is present, at which point the characters are placed inside a bounding box that is then analyzed by convolutional neural nets that try to recognize the characters and determine what's being communicated. This technology has been in practice for a while -- Facebook has been working with OCR since 2015 -- but implementing this across the company's vast networks provides a crazy degree of scale that motivated the company to develop some new strategies around character detection and recognition.
It will be used to suppress speech.
How is content, i.e. simple information/knowledge, abusive? Does it come out of the screen and berate you?
"We can now identify conservative and Trump-supporting users much more rapidly in order to ban them for #WrongThink!"
Of course this tech is being spun to "save the children", but it is also used to screen all advertisements that run on FB. They do not want ads to contain much text - less than 20% of the area of the ad image can be text. This is detected automatically using the technology described, and their system will stop the ad if it doesn't meet that requirement.
We've found that images with less than 20% text perform better.
To create a better experience for audiences and advertisers, ads that run on Facebook, Instagram and Audience Network are subject to a review process that looks at the amount of image text used in your ad. Based on this review, ads with higher amounts of image text may not be shown. Keep in mind that some ad images may qualify for an exception. For example, book covers, album covers and product images usually qualify for an exception.
https://www.facebook.com/busin...
And from the blurb:
detect text that is present, at which point the characters are placed inside a bounding box
Thus the area of the bounding boxes (after performing a union) can be at most 20% of the area of the image.
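For anyone who wants to see what that check might look like, here is a minimal sketch. It is only a guess at the arithmetic, not Facebook's actual code: the function names, the (x0, y0, x1, y1) box format, and the choice of "at most 20%" as the cutoff are all assumptions made for illustration. Overlapping boxes are counted once by taking the union on a compressed coordinate grid.

```python
def union_area(boxes):
    """Area covered by at least one axis-aligned box (x0, y0, x1, y1)."""
    # Coordinate compression: every box edge becomes a grid line.
    xs = sorted({x for x0, _, x1, _ in boxes for x in (x0, x1)})
    ys = sorted({y for _, y0, _, y1 in boxes for y in (y0, y1)})
    area = 0
    for xa, xb in zip(xs, xs[1:]):
        for ya, yb in zip(ys, ys[1:]):
            # A grid cell counts once if any box fully contains it.
            if any(x0 <= xa and xb <= x1 and y0 <= ya and yb <= y1
                   for x0, y0, x1, y1 in boxes):
                area += (xb - xa) * (yb - ya)
    return area


def text_fraction(boxes, width, height):
    """Fraction of a width x height image covered by text boxes."""
    return union_area(boxes) / (width * height)


def passes_text_limit(boxes, width, height, limit=0.20):
    """Hypothetical ad check: text may cover at most `limit` of the image."""
    return text_fraction(boxes, width, height) <= limit


# Two overlapping boxes on a 400x300 image: the overlap is counted once.
example_boxes = [(0, 0, 100, 50), (50, 25, 150, 75)]
print(union_area(example_boxes))
print(passes_text_limit(example_boxes, 400, 300))
```

The nested loop is quadratic in the number of distinct box edges, which is fine for the handful of text regions in a typical ad image. A sweep-line algorithm would scale better if there were thousands of boxes.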
Better known as 318230.
Now we should start writing text on memes using Captcha fonts.