Facebook Scraped Data Issue Surfaces in Vietnam

SafetyDetectives Cybersecurity Team

The SafetyDetectives security research team, led by Anurag Sen, has uncovered a significant leak of Facebook data. Roughly 3 gigabytes of scraped Facebook user data was found on an Elastic server, which raises additional concerns about the company’s security measures.

This follows not only the Cambridge Analytica scandal of March 2018, but also an earlier scrape of Facebook user data in January 2020 by hackers purportedly based in Vietnam. The data our team found comes on top of that earlier discovery and adds another 12 million records to the total. Many, but not all, of the entries included full personally identifiable information (PII) drawn from multiple sources, Facebook included. We still do not know who is ultimately responsible for this scrape or how they were able to carry out such an extensive and invasive operation.

Since we discovered the leak, the server has been taken offline.

What is Data Scraping?

Data scraping is a means of extracting information from a website, usually with automated tools. It’s a fairly common practice, with several vendors providing tools that allow anyone to scrape data.

Most data scraping is completely innocuous and is carried out by web developers, business intelligence analysts, and honest businesses such as travel booking sites, or for online market research. Negative consequences occur only when the practice is weaponized with the specific goal of extracting personal information.
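To make the benign case concrete, here is a minimal sketch of how a scraper might pull structured records out of a page, along the lines of a travel booking site comparing fares. The HTML fragment, the class names `route` and `price`, and the `FareParser` helper are all hypothetical and chosen only for illustration; a real scraper would download the page first and would need to respect the site's terms of use.

```python
from html.parser import HTMLParser

# Hypothetical page fragment; a real scraper would fetch this from a website.
SAMPLE_HTML = """
<ul>
  <li class="fare"><span class="route">Hanoi to Da Nang</span> <span class="price">45</span></li>
  <li class="fare"><span class="route">Hanoi to Hue</span> <span class="price">38</span></li>
</ul>
"""


class FareParser(HTMLParser):
    """Collects the text of every <span> whose class is 'route' or 'price'."""

    def __init__(self):
        super().__init__()
        self._field = None  # class of the span currently being read, if any
        self.fares = []     # list of {"route": ..., "price": ...} dicts

    def handle_starttag(self, tag, attrs):
        cls = dict(attrs).get("class")
        if tag == "span" and cls in ("route", "price"):
            self._field = cls

    def handle_data(self, data):
        if self._field == "route":
            # A route span starts a new record.
            self.fares.append({"route": data.strip()})
        elif self._field == "price" and self.fares:
            # A price span completes the most recent record.
            self.fares[-1]["price"] = int(data)

    def handle_endtag(self, tag):
        if tag == "span":
            self._field = None


def scrape_fares(html):
    """Return the fare records found in the given HTML text."""
    parser = FareParser()
    parser.feed(html)
    return parser.fares


fares = scrape_fares(SAMPLE_HTML)
```

The same mechanism becomes harmful when the fields being collected are names, email addresses, or birthdates rather than public prices.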

Social media companies such as Facebook allow users to log in to third-party websites with their existing Facebook login information. However, if security protocols are not properly implemented, this can allow hackers using so-called ‘scraper bots’ to extract private information.

Why is Data Scraping Dangerous?

When private information – such as login details, addresses, and birth information – is extracted, it can allow unauthorized users to commit heinous acts, including identity theft and financial fraud.

Because such scraping is often automated, with bots conducting all of the data extraction, millions of innocent users can have their information leaked within a short period of time. It is worth noting that Facebook currently has around 2.2 billion monthly users.

Data can then be sold or provided to other malicious parties, thereby making the potential ramifications of data scraping wide-ranging and severe.

What has been leaked?

The Elastic server exposed data relating to 12 million Facebook users, amounting to roughly 3 gigabytes.

Number of Records Leaked: 12 million Facebook users

Size: 3 gigabytes

Location: Vietnam

It is important to note that much of the scraped and leaked information is not meant to be publicly visible, especially without the knowledge and approval of the user.

Data included in the breach:

  • Full name
  • Hometown location
  • Current location
  • Education details
  • Family relations with other Facebook users
  • Birthdates
  • GPS coordinates
  • Email addresses
  • Facebook usernames and IDs
  • Profile scores

PII, including birthdate, city of residence, gender, email address, and the individual’s unique Facebook ID

Education information about each user, including unique Facebook user IDs and school IDs.

Potential Ramifications

  • Identity theft
  • Targeted crimes against individuals
  • Blackmail and extortion
  • Unsolicited marketing
  • Potential account takeover (enough information was included to uncover login and password recovery information)

Facebook concerns

This latest breach in Vietnam is particularly sensitive because of Facebook’s recent history with data scraping. The beginning of the year saw an earlier leak of Vietnamese Facebook users’ data, and this discovery shows that the threat to that data extends even further than previously believed. In March 2018, it emerged that a political consulting group called Cambridge Analytica had been able to ‘harvest’ – or scrape – personal data relating to 50 million Facebook users. The number of people affected was later revised up to 87 million, with Facebook declaring that resolving the vulnerabilities would be a “multi-year process”.

The incident made headlines around the world because of the political connections and suspected impact on US elections. In response, Facebook decided to lock down some of its API functions, including those used for data scraping, in order to make the practice more difficult. The US social media giant also blocked users from using its reverse search tool – a means of using snippets of data to identify and gather even larger data sets.

It was this feature that had allowed malicious actors to scrape Facebook user data, and the social media company stated that the issue had been resolved.

Clearly, there are still data-scraping vulnerabilities that can be exploited, especially where the security protocols implemented by third-party websites do not match those used by Facebook.

Preventing Data Exposure

How can you prevent your personal information from being exposed in a data leak and ensure that you’re not a victim of attacks – both online and offline – if it is leaked?

  • Be cautious of what information you give out and to whom
  • Check that the website you’re on is secure (look for an https designation and/or a closed lock in the URL address bar)
  • Only give out what you feel confident cannot be used against you (avoid government ID information and sensitive personal information that can compromise you, if made public)
  • Create secure passwords by combining letters, numbers and symbols – a password manager such as Dashlane can help with this
  • Do not click links in emails unless you are sure that the sender is legitimate and trustworthy
  • Double-check the privacy settings of any social media accounts (even ones you no longer use) to ensure that your posts and personal details are visible only to people you trust
  • Avoid using credit card information and typing out passwords over unsecured WiFi networks
  • Conduct further research into what constitutes ‘cybercrime’ and stay up to date on the latest hacks and cyber threats online, such as phishing attacks and ransomware.
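Two of the tips above, creating passwords that combine letters, numbers and symbols, and checking for an https designation, can be sketched in a few lines of Python. This is a minimal illustration rather than a replacement for a password manager: the function names, the default length of 16, and the `SYMBOLS` set are our own assumptions, and the https check only inspects the URL scheme, not the validity of the site's certificate.

```python
import secrets
import string
from urllib.parse import urlparse

# Assumed symbol set for illustration; some sites accept a different set.
SYMBOLS = "!@#$%^&*-_=+?"


def generate_password(length=16):
    """Return a random password with lowercase, uppercase, digit and symbol characters."""
    if length < 4:
        raise ValueError("length must be at least 4 to fit every character class")
    alphabet = string.ascii_lowercase + string.ascii_uppercase + string.digits + SYMBOLS
    while True:
        # secrets draws from the operating system's cryptographic random source.
        password = "".join(secrets.choice(alphabet) for _ in range(length))
        # Retry until every character class is represented at least once.
        if (any(c.islower() for c in password)
                and any(c.isupper() for c in password)
                and any(c.isdigit() for c in password)
                and any(c in SYMBOLS for c in password)):
            return password


def uses_https(url):
    """Return True if the URL's scheme is https."""
    return urlparse(url).scheme == "https"
```

For example, `generate_password(20)` returns a different 20-character password on every call, and `uses_https("http://example.com")` returns `False`.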

About Us

SafetyDetectives.com is the world’s largest antivirus review website. The SafetyDetectives research lab is a pro bono service that aims to help the online community defend itself against cyber threats, while educating organizations on protecting their users’ data. You can view some of our top antivirus recommendations here.

Published on: May 21, 2020

About the Author

The SafetyDetectives research lab is a pro bono service that aims to help the online community defend itself against cyber threats while educating organizations on how to protect their users’ data. The overarching purpose of our web mapping project is to help make the internet a safer place for all users.