Privacy

Man Behind LinkedIn Scraping Said He Grabbed 700 Million Profiles 'For Fun' (9to5mac.com) 27

The man behind last month's scraping of LinkedIn data, which exposed the locations, phone numbers, and inferred salaries of 700 million users, says that he did it "for fun" -- though he is also selling the data. 9to5Mac reports: BBC News spoke with the man who took the data, operating under the name Tom Liner: "How would you feel if all your information was catalogued by a hacker and put into a monster spreadsheet with millions of entries, to be sold online to the highest paying cyber-criminal? That's what a hacker calling himself Tom Liner did last month 'for fun' when he compiled a database of 700 million LinkedIn users from all over the world, which he is selling for around $5,000 [...]. In the case of Mr Liner, his latest exploit was announced at 08:57 BST in a post on a notorious hacking forum [...] 'Hi, I have 700 million 2021 LinkedIn records,' he wrote. Included in the post was a link to a sample of a million records and an invite for other hackers to contact him privately and make him offers for his database."

Liner says he was also behind the scraping of 533 million Facebook profiles back in April (you can check whether your data was grabbed): "Tom told me he created the 700 million LinkedIn database using 'almost the exact same technique' that he used to create the Facebook list. He said: 'It took me several months to do. It was very complex. I had to hack the API of LinkedIn. If you do too many requests for user data in one time then the system will permanently ban you.'"

Databases

The Case Against SQL (scattered-thoughts.net) 297

Long-time Slashdot reader RoccamOccam shares "an interesting take on SQL and its issues" from Jamie Brandon (who describes himself as an independent researcher who's built database engines, query planners, compilers, developer tools and interfaces).

Its title? "Against SQL." The relational model is great... But SQL is the only widely-used implementation of the relational model, and it is: Inexpressive, Incompressible, Non-porous. This isn't just a matter of some constant programmer overhead, like SQL queries taking 20% longer to write. The fact that these issues exist in our dominant model for accessing data has dramatic downstream effects for the entire industry:

- Complexity is a massive drag on quality and innovation in runtime and tooling
- The need for an application layer with hand-written coordination between database and client renders useless most of the best features of relational databases

The core message that I want people to take away is that there is potentially a huge amount of value to be unlocked by replacing SQL, and more generally in rethinking where and how we draw the lines between databases, query languages and programming languages...

I'd like to finish with this quote from Michael Stonebraker, one of the most prominent figures in the history of relational databases:

"My biggest complaint about System R is that the team never stopped to clean up SQL... All the annoying features of the language have endured to this day. SQL will be the COBOL of 2020..."

It's been interesting to follow the discussion on Twitter, where the post's author tweeted screenshots of actual SQL code to illustrate various shortcomings. But he also notes that "The SQL spec (part 2 = 1732 pages) is more than twice the length of the Javascript 2021 spec (879 pages), almost matches the C++ 2020 spec (1853 pages) and contains 411 occurrences of 'implementation-defined', occurrences which include type inference and error propagation."

His Twitter feed also includes a supportive retweet from Rust creator Graydon Hoare, and from a Tetrane developer who says "The Rust of SQL remains to be invented. I would like to see it come."
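
The second bullet point above, the hand-written coordination between database and client, is the easiest complaint to make concrete. Because SQL crosses into the host language only as opaque strings and untyped rows, applications end up writing mapping glue like the following minimal sketch, which uses Python's built-in sqlite3 module with a schema and dataclass invented purely for illustration:

```python
import sqlite3
from dataclasses import dataclass

# Hypothetical schema and record type, invented for this sketch.
@dataclass
class Employee:
    id: int
    name: str
    salary: float

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (id INTEGER, name TEXT, salary REAL)")
conn.executemany("INSERT INTO employees VALUES (?, ?, ?)",
                 [(1, "Ada", 120000.0), (2, "Grace", 95000.0)])

# The query is an opaque string: the database cannot share types, names, or
# reusable expressions with the host language, so the column list and the
# row-to-object mapping are both repeated by hand on the client side.
rows = conn.execute(
    "SELECT id, name, salary FROM employees WHERE salary > ?", (100000,))
employees = [Employee(id=r[0], name=r[1], salary=r[2]) for r in rows]
print(employees)  # [Employee(id=1, name='Ada', salary=120000.0)]
```

Object-relational mappers exist largely to generate this glue automatically, which is arguably the essay's point: the coordination has to live somewhere outside the query language.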

Government

EPA Approved Toxic Chemicals For Fracking a Decade Ago, New Files Show (nytimes.com) 137

An anonymous reader quotes a report from The New York Times: For much of the past decade, oil companies engaged in drilling and fracking have been allowed to pump into the ground chemicals that, over time, can break down into toxic substances known as PFAS -- a class of long-lasting compounds known to pose a threat to people and wildlife -- according to internal documents from the Environmental Protection Agency. The E.P.A. in 2011 approved the use of these chemicals, used to ease the flow of oil from the ground, despite the agency's own grave concerns about their toxicity, according to the documents, which were reviewed by The New York Times. The E.P.A.'s approval of the three chemicals wasn't previously publicly known. The records, obtained under the Freedom of Information Act by a nonprofit group, Physicians for Social Responsibility, are among the first public indications that PFAS, long-lasting compounds also known as "forever chemicals," may be present in the fluids used during drilling and hydraulic fracturing, or fracking.

In a consent order issued for the three chemicals on Oct. 26, 2011, E.P.A. scientists pointed to preliminary evidence that, under some conditions, the chemicals could "degrade in the environment" into substances akin to PFOA, a kind of PFAS chemical, and could "persist in the environment" and "be toxic to people, wild mammals, and birds." The E.P.A. scientists recommended additional testing. Those tests were not mandatory and there is no indication that they were carried out. "The E.P.A. identified serious health risks associated with chemicals proposed for use in oil and gas extraction, and yet allowed those chemicals to be used commercially with very lax regulation," said Dusty Horwitt, researcher at Physicians for Social Responsibility. [...] There is no public data that details where the E.P.A.-approved chemicals have been used. But the FracFocus database, which tracks chemicals used in fracking, shows that about 120 companies used PFAS -- or chemicals that can break down into PFAS; the most common of which was "nonionic fluorosurfactant" and various misspellings -- in more than 1,000 wells between 2012 and 2020 in Texas, Arkansas, Louisiana, Oklahoma, New Mexico, and Wyoming. Because not all states require companies to report chemicals to the database, the number of wells could be higher. Nine of those wells were in Carter County, Okla., within the boundaries of Chickasaw Nation. "This isn't something I was aware of," said Tony Choate, a Chickasaw Nation spokesman. [...] The findings underscore how, for decades, the nation's laws governing various chemicals have allowed thousands of substances to go into commercial use with relatively little testing. The E.P.A.'s assessment was carried out under the 1976 Toxic Substances Control Act, which authorizes the agency to review and regulate new chemicals before they are manufactured or distributed.
"[T]he Toxic Substances Control Act grandfathered in thousands of chemicals already in commercial use, including many PFAS chemicals," the report says. "In 2016, Congress strengthened the law, bolstering the E.P.A.'s authority to order health testing, among other measures. The Government Accountability Office, the watchdog arm of Congress, still identifies the Toxic Substances Control Act as a program with one of the highest risks of abuse and mismanagement." According to a recent report from the Intercept, "the E.P.A. office in charge of reviewing toxic chemicals tampered with the assessments of dozens of chemicals to make them appear safer."

Republicans

Hackers Scrape 90,000 GETTR User Emails, Surprising No One (vice.com) 75

Just days after its launch, hackers have already found a way to take advantage of GETTR's buggy API to get the username, email address, and location of thousands of users. Motherboard reports: Hackers were able to scrape the email addresses and other data of more than 90,000 GETTR users. On Tuesday, a user of a notorious hacking forum posted a database that they claimed was a scrape of all users of GETTR, the new social media platform launched last week by Trump's former spokesman Jason Miller, who pitched it as an alternative to "cancel culture." The data seen by Motherboard includes email addresses, usernames, status, and location. One of the people whose email is in the database confirmed to Motherboard that they are indeed registered to GETTR. Motherboard also verified the database by attempting to create an account with three email addresses that appear in the database. When doing that, the site displayed the message: "The email is taken," suggesting it's already registered. It's unclear if the database contains the usernames and email addresses of all users on the site. Alon Gal, the co-founder and CTO of cybersecurity firm Hudson Rock, found the forum post with the database. "When threat actors are able to extract sensitive information due to neglectful API implementations, the consequence is equivalent to a data breach and should be handled accordingly by the firm and to be examined by regulators," he told Motherboard in an online chat.

Crime

A Threat to Privacy in the Expanded Use of License Plate-Scanning Cameras? (yahoo.com) 149

Long-time Slashdot reader BigVig209 shares a Chicago Tribune report "on how suburban police departments in the Chicago area use license plate cameras as a crime-fighting tool." Critics of the cameras note that only a tiny percentage of the billions of plates photographed lead to an arrest, and that the cameras generally haven't been shown to prevent crime. More importantly they say the devices are unregulated, track innocent people and can be misused to invade drivers' privacy. The controversy comes as suburban police departments continue to expand the use of the cameras to combat rising crime. Law enforcement officials say they are taking steps to safeguard the data. But privacy advocates say the state should pass a law to ensure against improper use of a nationwide surveillance system operated by private companies.

Across the Chicago area, one survey by the nonprofit watchdog group Muckrock found 88 cameras used by more than two dozen police agencies. In response to a surge in shootings, after much delay, state police are taking steps to add the cameras to area expressways. In the northwest suburbs, Vernon Hills and Niles are among several departments that have added license plate cameras recently. The city of Chicago has ordered more than 200 cameras for its squad cars. In Indiana, the city of Hammond has taken steps to record nearly every vehicle that comes into town.

Not all police like the devices. In the southwest suburbs, Darien and La Grange had issues in years past with the cameras making false readings, and some officers stopped using them...

Homeowner associations may also tie their cameras into the systems, which is what led to the arrest in Vernon Hills. One of the leading sellers of such cameras, Vigilant Solutions, a part of Chicago-based Motorola Solutions, has collected billions of license plate numbers in its National Vehicle Location Service. The database shares information from thousands of police agencies, and can be used to find cars across the country... Then there is the potential for abuse by police. One investigation found that officers nationwide misused agency databases hundreds of times, to check on ex-girlfriends, romantic rivals, or perceived enemies. To address those concerns, 16 states have passed laws restricting the use of the cameras.

The article cites an EFF survey which found 99.5% of scanned plates weren't under suspicion — "and that police shared their data with an average of 160 other agencies."

"Two big concerns the American Civil Liberties Union has always had about the cameras are that the information can be used to track the movements of the general population, and often is sold by operators to third parties like credit and insurance companies."

EU

OpenStreetMap Looks To Relocate To EU Due To Brexit Limitations (theguardian.com) 99

OpenStreetMap, the Wikipedia-for-maps organisation that seeks to create a free and open-source map of the globe, is considering relocating to the EU, almost 20 years after it was founded in the UK by the British entrepreneur Steve Coast. From a report: OpenStreetMap Foundation, which was formally registered in 2006, two years after the project began, is a limited company registered in England and Wales. Following Brexit, the organisation says the lack of agreement between the UK and EU could render its continued operation in Britain untenable. "There is not one reason for moving, but a multitude of paper cuts, most of which have been triggered or amplified by Brexit," Guillaume Rischard, the organisation's treasurer, told members of the foundation in an email sent earlier this month.

One "important reason," Rischard said, was the failure of the UK and EU to agree on mutual recognition of database rights. While both have an agreement to recognise copyright protections, that only covers work which is creative in nature. Maps, as a simple factual representation of the world, are not covered by copyright in the same way, but until Brexit were covered by an EU-wide agreement that protected databases where there had been "a substantial investment in obtaining, verifying or presenting the data." But since Brexit, any database made on or after 1 January 2021 in the UK will not be protected in the EU, and vice versa.

Security

LinkedIn Breach Reportedly Exposes Data of 92% of Users, Including Inferred Salaries (9to5mac.com) 47

A second massive LinkedIn breach reportedly exposes the data of 700M users, which is more than 92% of the total 756M users. The database is for sale on the dark web, with records including phone numbers, physical addresses, geolocation data, and inferred salaries. 9to5Mac reports: RestorePrivacy reports that the hacker appears to have misused the official LinkedIn API to download the data, the same method used in a similar breach back in April: "On June 22nd, a user of a popular hacker forum advertised data from 700 Million LinkedIn users for sale. The user of the forum posted up a sample of the data that includes 1 million LinkedIn users. We examined the sample and found it to contain the following information: Email Addresses; Full names; Phone numbers; Physical addresses; Geolocation records; LinkedIn username and profile URL; Personal and professional experience/background; Genders; and Other social media accounts and usernames."

With the previous breach, LinkedIn did confirm that the 500M records included data obtained from its servers, but claimed that more than one source was used. PrivacyShark notes that the company has issued a similar statement this time: "While we're still investigating this issue, our initial analysis indicates that the dataset includes information scraped from LinkedIn as well as information obtained from other sources. This was not a LinkedIn data breach and our investigation has determined that no private LinkedIn member data was exposed. Scraping data from LinkedIn is a violation of our Terms of Service and we are constantly working to ensure our members' privacy is protected."

Intel

Intel To Disable TSX By Default On More CPUs With New Microcode (phoronix.com) 46

Intel is going to be disabling Transactional Synchronization Extensions (TSX) by default for various Skylake through Coffee Lake processors with forthcoming microcode updates. Phoronix reports: Transactional Synchronization Extensions (TSX) have been around since Haswell for hardware transactional memory support and going off Intel's own past numbers can be around 40% faster in specific workloads or as much as 4~5 times faster in database transaction benchmarks. TSX issues have been found in the past such as a possible side channel timing attack that could lead to KASLR being defeated and CVE-2019-11135 (TSX Async Abort) for an MDS-style flaw. Now in 2021 Intel is disabling TSX by default across multiple families of Intel CPUs from Skylake through Coffee Lake. [...] The Linux kernel is preparing for this microcode change as seen in the flow of new patches this morning for the 5.14 merge window.

A memory ordering issue is what is reportedly leading Intel to now deprecate TSX on various processors. There is this Intel whitepaper (PDF) updated this month that outlines the problem at length. As noted in the revision history, the memory ordering issue has been known to Intel since at least before October 2018, but only now in June 2021 are they pushing out microcode updates to disable TSX by default. The forthcoming microcode updates will effectively deprecate TSX for all Skylake Xeon CPUs prior to Stepping 5 (including Xeon D and 1st Gen Xeon Scalable), all 6th Gen Xeon E3-1500m v5 / E3-1200 v5 Skylake processors, all 7th/8th Gen Core and Pentium Kaby/Coffee/Whiskey Lake CPUs prior to 0x8 stepping, and all 8th/9th Gen Core/Pentium Coffee Lake CPUs prior to 0xC stepping. That ultimately spans from various Skylake steppings through Coffee Lake; it was with 10th Gen Comet Lake and Ice Lake that TSX/TSX-NI was subsequently removed.

In addition to disabling TSX by default and force-aborting all RTM transactions by default, a new CPUID bit is being enumerated with the new microcode to indicate the force-aborting of RTM transactions. It's due to that new CPUID bit that the Linux kernel is seeing patches. Previously, Linux and other operating systems applied a workaround for the TSX memory ordering issue, but now that the feature is disabled, the kernel can drop said workaround. These patches are coming with the Linux 5.14 cycle and will likely be back-ported to stable too.
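
For readers wondering whether a particular Linux machine still advertises TSX after the microcode and kernel changes, the kernel exposes the relevant feature flags (hle and rtm) in /proc/cpuinfo. A minimal sketch of that check; what appears there depends on the installed microcode, the kernel version, and any tsx= boot parameter:

```python
# Report whether the kernel still sees TSX (HLE/RTM) on this CPU.
# Linux-only: reads the feature flags the kernel exposes in /proc/cpuinfo,
# which reflect the installed microcode and any tsx= boot parameter.
def tsx_flags() -> dict[str, bool]:
    with open("/proc/cpuinfo") as f:
        for line in f:
            if line.startswith("flags"):
                flags = set(line.split(":", 1)[1].split())
                return {"hle": "hle" in flags, "rtm": "rtm" in flags}
    return {"hle": False, "rtm": False}

if __name__ == "__main__":
    print(tsx_flags())  # e.g. {'hle': False, 'rtm': False} once TSX is disabled
```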

China

Scientist Finds Early Virus Sequences That Had Been Mysteriously Deleted (seattletimes.com) 336

UPDATE (7/30): All the missing virus sequences have now been published, with their deletion being explained as just "an editorial oversight by a scientific journal," according to the New York Times.

In Slashdot's original report, an anonymous reader quoted another report from The New York Times: About a year ago, genetic sequences from more than 200 virus samples from early cases of Covid-19 in Wuhan disappeared from an online scientific database. Now, by rooting through files stored on Google Cloud, a researcher in Seattle reports that he has recovered 13 of those original sequences -- intriguing new information for discerning when and how the virus may have spilled over from a bat or another animal into humans. The new analysis, released on Tuesday, bolsters earlier suggestions that a variety of coronaviruses may have been circulating in Wuhan before the initial outbreaks linked to animal and seafood markets in December 2019. As the Biden administration investigates the contested origins of the virus, known as SARS-CoV-2, the study neither strengthens nor discounts the hypothesis that the pathogen leaked out of a famous Wuhan lab. But it does raise questions about why original sequences were deleted, and suggests that there may be more revelations to recover from the far corners of the internet.
UPDATE (6/25): The Washington Post notes the data wasn't exactly suppressed. "Processed forms of the same data were included in a preprint paper from Chinese scientists posted in March 2020 and, after peer review, published that June in the journal Small." And in addition: The NIH released a statement Wednesday saying that a researcher who originally published the genetic sequences asked for them to be removed from the NIH database so that they could be included in a different database. The agency said it is standard practice to remove data if requested to do so...

Bloom's paper acknowledges that there are benign reasons why researchers might want to delete data from a public database. The data cited by Bloom are not alone in being removed by the NIH during the pandemic. The agency, in response to an inquiry from The Post, said the National Library of Medicine has so far identified eight instances since the start of the pandemic when researchers had withdrawn submissions to the library.

"This one from China and the rest from submitters predominantly in the U.S.," the NIH said in its response. "All of those followed standard operating procedures."

The New York Times writes: The genetic sequences of viral samples hold crucial clues about how SARS-CoV-2 shifted to our species from another animal, most likely a bat. Most precious of all are sequences from early in the pandemic, because they take scientists closer to the original spillover event. As [Jesse Bloom, a virologist at the Fred Hutchinson Cancer Research Center who wrote the new report] was reviewing what genetic data had been published by various research groups, he came across a March 2020 study with a spreadsheet that included information on 241 genetic sequences collected by scientists at Wuhan University. The spreadsheet indicated that the scientists had uploaded the sequences to an online database called the Sequence Read Archive, managed by the U.S. government's National Library of Medicine. But when Dr. Bloom looked for the Wuhan sequences in the database earlier this month, his only result was "no item found." Puzzled, he went back to the spreadsheet for any further clues. It indicated that the 241 sequences had been collected by a scientist named Aisi Fu at Renmin Hospital in Wuhan. Searching medical literature, Dr. Bloom eventually found another study posted online in March 2020 by Dr. Fu and colleagues, describing a new experimental test for SARS-CoV-2. The Chinese scientists published it in a scientific journal three months later. In that study, the scientists wrote that they had looked at 45 samples from nasal swabs taken "from outpatients with suspected Covid-19 early in the epidemic." They then searched for a portion of SARS-CoV-2's genetic material in the swabs. The researchers did not publish the actual sequences of the genes they fished out of the samples. Instead, they only published some mutations in the viruses.

But a number of clues indicated to Dr. Bloom that the samples were the source of the 241 missing sequences. The papers included no explanation as to why the sequences had been uploaded to the Sequence Read Archive, only to disappear later. Perusing the archive, Dr. Bloom figured out that many of the sequences were stored as files on Google Cloud. Each sequence was contained in a file in the cloud, and the names of the files all shared the same basic format, he reported. Dr. Bloom swapped in the code for a missing sequence from Wuhan. Suddenly, he had the sequence. All told, he managed to recover 13 sequences from the cloud this way. With this new data, Dr. Bloom looked back once more at the early stages of the pandemic. He combined the 13 sequences with other published sequences of early coronaviruses, hoping to make progress on building the family tree of SARS-CoV-2. Working out all the steps by which SARS-CoV-2 evolved from a bat virus has been a challenge because scientists still have a limited number of samples to study. Some of the earliest samples come from the Huanan Seafood Wholesale Market in Wuhan, where an outbreak occurred in December 2019. But those market viruses actually have three extra mutations that are missing from SARS-CoV-2 samples collected weeks later. In other words, those later viruses look more like coronaviruses found in bats, supporting the idea that there was some early lineage of the virus that did not pass through the seafood market. Dr. Bloom found that the deleted sequences he recovered from the cloud also lack those extra mutations. "They're three steps more similar to the bat coronaviruses than the viruses from the Huanan fish market," Dr. Bloom said. This suggests, he said, that by the time SARS-CoV-2 reached the market, it had been circulating for awhile in Wuhan or beyond. The market viruses, he argued, aren't representative of full diversity of coronaviruses already loose in late 2019.
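
The recovery technique described above amounts to spotting a predictable object-storage naming scheme and substituting sequence identifiers into it. The sketch below illustrates only the general idea; the URL template and accession codes are invented placeholders, since the article does not publish the real file paths:

```python
# Sketch of the recovery idea: if deleted records once lived in cloud object
# storage under a predictable naming scheme, substituting accession codes into
# that scheme can surface files that no longer appear in the index.
# The URL template and accession codes below are hypothetical placeholders.
import urllib.request
import urllib.error

URL_TEMPLATE = "https://storage.example.com/sra-archive/{accession}.fastq.gz"
ACCESSIONS = ["SRR0000001", "SRR0000002"]  # placeholders, not the real codes

for acc in ACCESSIONS:
    url = URL_TEMPLATE.format(accession=acc)
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            data = resp.read()
        print(f"{acc}: recovered {len(data)} bytes")
    except urllib.error.URLError as err:
        print(f"{acc}: fetch failed ({err})")
```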

UPDATE (7/30): When republishing their sequences, the researchers indicated they actually came from January 30, 2020 (and not "late 2019").

Social Networks

A Real Estate Mogul Will Spend $100 Million to Fix Social Media Using Blockchain (msn.com) 93

"Frank McCourt, the billionaire real estate mogul and former owner of the Los Angeles Dodgers, is pouring $100 million into an attempt to rebuild the foundations of social media," reports Bloomberg: The effort, which he has loftily named Project Liberty, centers on the construction of a publicly accessible database of people's social connections, allowing users to move records of their relationships between social media services instead of being locked into a few dominant apps.

The undercurrent to Project Liberty is a fear of the power that a few huge companies — and specifically Facebook Inc. — have amassed over the last decade... Project Liberty would use blockchain to construct a new internet infrastructure called the Decentralized Social Networking Protocol. With cryptocurrencies, blockchain stores information about the tokens in everyone's digital wallets; the DSNP would do the same for social connections. Facebook owns the data about the social connections between its users, giving it an enormous advantage over competitors. If all social media companies drew from a common social graph, the theory goes, they'd have to compete by offering better services, and the chance of any single company becoming so dominant would plummet.
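
Bloomberg's description stays high-level, and the article gives no technical detail of DSNP itself, but the core idea of a user-owned, portable social graph can be sketched with a toy record format: a user exports a signed list of connections that any competing service could import. The field names, JSON layout, and the hash standing in for a real signature below are all invented for illustration:

```python
# Toy illustration of a user-owned, portable social-graph record.
# The format, field names, and the SHA-256 "signature" stand-in are invented
# for this sketch; they are not the actual DSNP specification.
import hashlib
import json

def export_graph(user_id: str, follows: list[str]) -> str:
    record = {"user": user_id, "follows": sorted(follows)}
    payload = json.dumps(record, sort_keys=True)
    # Stand-in for a real cryptographic signature tying the record to its owner.
    record["digest"] = hashlib.sha256(payload.encode()).hexdigest()
    return json.dumps(record)

def import_graph(blob: str) -> dict:
    record = json.loads(blob)
    payload = json.dumps({"user": record["user"], "follows": record["follows"]},
                         sort_keys=True)
    if record["digest"] != hashlib.sha256(payload.encode()).hexdigest():
        raise ValueError("record has been tampered with")
    return record

# A user exports their connections from one service and imports them elsewhere.
blob = export_graph("alice", ["carol", "bob"])
print(import_graph(blob))  # {'user': 'alice', 'follows': ['bob', 'carol'], 'digest': ...}
```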

Building DSNP falls to Braxton Woodham, the co-founder of the meal delivery service Sun Basket and former chief technology officer of Fandango, the movie ticket website... McCourt hired Woodham to build the protocol, and pledged to put $75 million into an institute at Georgetown University in Washington, D.C., and Sciences Po in Paris to research technology that serves the common good. The rest of his $100 million will go toward pushing entrepreneurs to build services that utilize the DSNP...

A decentralized approach to social media could actually undermine the power of content moderation, by making it easier for users who are kicked off one platform to simply migrate their audiences to more permissive ones. McCourt and Woodham say blockchain could discourage bad behavior because people would be tied to their posts forever...

Eventually, the group plans to create its own consumer product on top of the DSNP infrastructure, and wrote in a press release that the eventual result will be an "open, inclusive data economy where individuals own, control and derive greater social and economic value from their personal information."

Science

The First 'Google Translate' For Elephants Debuts (scientificamerican.com) 50

An anonymous reader quotes a report from Scientific American: Elephants possess an incredibly rich repertoire of communication techniques, including hundreds of calls and gestures that convey specific meanings and can change depending on the context. Different elephant populations also exhibit culturally learned behaviors unique to their specific group. Elephant behaviors are so complex, in fact, that even scientists may struggle to keep up with them all. Now, to get the animals and researchers on the same page, a renowned biologist who has been studying endangered savanna elephants for nearly 50 years has co-developed a digital elephant ethogram, a repository of everything known about their behavior and communication.

[Joyce Poole, co-founder and scientific director of ElephantVoices, a nonprofit science and conservation organization, and co-creator of the new ethogram] built the easily searchable public database with her husband and research partner Petter Granli after they came to realize that scientific papers alone would no longer cut it for cataloging the discoveries they and others were making. The Elephant Ethogram currently includes more than 500 behaviors depicted through nearly 3,000 annotated videos, photographs and audio files. The entries encompass the majority, if not all, of typical elephant behaviors, which Poole and Granli gleaned from more than 100 references spanning more than 100 years, with the oldest records dating back to 1907. About half of the described behaviors came from the two investigators' own studies and observations, while the rest came from around seven other leading savanna elephant research teams.

While the ethogram is primarily driven by Poole and Granli's observations, "there are very few, if any, examples of behaviors described in the literature that we have not seen ourselves," Poole points out. The project is also just beginning, she adds, because it is meant to be a living catalog that scientists actively contribute to as new findings come in. Poole and Granli believe the exhaustive, digitized Elephant Ethogram is the first of its kind for any nonhuman wild animal. The multimedia-based nature of the project is important, Poole adds, because with descriptions based only on the written word, audio files or photographs, "it is hard to show the often subtle differences in movement that differentiate one behavior from another." Now that the project is online, Poole hopes other researchers will begin contributing their own observations and discoveries, broadening the database to include cultural findings from additional savanna elephant populations and unusual behaviors Poole and Granli might have missed.

Twitter

Twitter Restricts Accounts In India To Comply With Government Legal Request (techcrunch.com) 48

An anonymous reader quotes a report from TechCrunch: Twitter disclosed on Monday that it blocked four accounts in India to comply with a new legal request from the Indian government. The American social network disclosed on Lumen Database, a Harvard University project, that it took action on four accounts -- including those of hip-hop artist L-Fresh the Lion and singer and song-writer Jazzy B -- to comply with a legal request from the Indian government it received over the weekend. The accounts are geo-restricted within India but accessible from outside of the South Asian nation. (As part of their transparency efforts, some companies including Twitter and Google make requests and orders they receive from governments and other entities public on Lumen Database.)

All four accounts, like several others that the Indian government ordered to be blocked in the country earlier this year, had protested New Delhi's agriculture reforms and some had posted other tweets that criticized Prime Minister Narendra Modi's seven years of governance in India, an analysis by TechCrunch found. The new legal request, which hasn't been previously reported, comes at a time when Twitter is making efforts to comply with the Indian government's new IT rules, new guidelines that several of its peers including Facebook and Google have already complied with. On Saturday, India's Ministry of Electronics and Information Technology had given a "final notice" to Twitter to comply with its new rules, which it unveiled in February this year. The new rules require significant social media firms to appoint and share contact details of representatives tasked with compliance, nodal point of reference and grievance redressals to address on-ground concerns.
Last month, police in Delhi visited Twitter offices to "serve a notice" to Twitter's India head. Twitter responded by calling the visit a form of intimidation, and requested the government respect citizens' rights to free speech.

United States

Supreme Court Narrows Scope of CFAA Computer Hacking Law (therecord.media) 79

The United States Supreme Court has ruled today in a 6-3 vote to overturn a hacking-related conviction for a Georgia police officer, and by doing so, it also narrowed down the scope of the US' primary hacking law, the Computer Fraud and Abuse Act. From a report: The ruling, No. 19-783, comes in the Van Buren v. United States case of Nathan Van Buren, a former police sergeant in Cumming, Georgia, who was sentenced to 18 months in prison in May 2018 for taking a bribe of $5,000 to look up a license plate for a woman one of his informants met at a local strip club. Prosecutors charged Van Buren under the CFAA and argued that even if the police officer had been authorized to access the police database as part of his work duties, he "exceeded authorized access" when he performed a search against department internal policies. In subsequent appeals, Van Buren argued that the "exceeds authorized access" language in the CFAA was too broad and requested that the US Supreme Court rule on the matter; the court agreed to take up the case and heard arguments last year.

Technology

Rescuers Question What3words' Use in Emergencies (bbc.com) 122

AmiMoJo writes: Mountain rescuers have questioned the accuracy of using a location app, citing dozens of examples where the wrong address was given to their teams. What3Words (W3W) divides the world into three-by-three metre squares, each with a three-word address. It is free and used by 85% of UK emergency services. Reasons for the errors were not given, but were likely to be things such as mispronunciation or spelling errors. W3W said human error was "a possibility with any type of tool." The mapping system was created by an algorithm which assigned three words to each square in the world. Mark Lewis, the head of ICT at Mountain Rescue England and Wales (MREW), said that the use of the W3W app had been "testing" for rescue teams. He gave the BBC a database from the last 12 months which listed 45 locations across England and Wales that rescuers received from lost or injured walkers and climbers, which turned out to be incorrect.
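
The mechanism behind such a system is straightforward to illustrate: lay a fixed grid over the map and spell each cell's index with words instead of digits. The toy encoder below is not W3W's proprietary algorithm, and its eight-word placeholder list is far too small to name every square uniquely (which is why a real system needs a vocabulary orders of magnitude larger), but it shows the basic idea:

```python
# Toy word-grid encoder showing the general idea behind three-word addressing.
# This is NOT What3Words' proprietary algorithm. With only the eight
# placeholder words below, three words can distinguish just 8**3 = 512 squares,
# so triples repeat; a real system needs a far larger vocabulary for three
# words to cover every ~3 m square on Earth uniquely.
WORDS = ["apple", "brick", "cloud", "delta", "ember", "flint", "grape", "harbor"]
BASE = len(WORDS)

CELL_METRES = 3.0
METRES_PER_DEGREE = 111_320          # crude equirectangular approximation
CELL_DEG = CELL_METRES / METRES_PER_DEGREE
COLS = int(360 / CELL_DEG)           # grid columns spanning all longitudes

def encode(lat: float, lon: float) -> list[str]:
    """Locate the grid cell for a coordinate, then spell its index in base BASE."""
    row = int((lat + 90) / CELL_DEG)
    col = int((lon + 180) / CELL_DEG)
    index = row * COLS + col
    words = []
    for _ in range(3):               # three "digits", one word each
        index, digit = divmod(index, BASE)
        words.append(WORDS[digit])
    return words

print(encode(51.5007, -0.1246))      # word triple for a ~3 m square near Big Ben
print(encode(51.5008, -0.1246))      # a square roughly 11 m away gets a different triple
```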

United States

Two New Laws Restrict Police Use of DNA Search Method (nytimes.com) 80

New laws in Maryland and Montana are the first in the nation to restrict law enforcement's use of genetic genealogy, the DNA matching technique that in 2018 identified the Golden State Killer, in an effort to ensure the genetic privacy of the accused and their relatives. From a report: Beginning on Oct. 1, investigators working on Maryland cases will need a judge's signoff before using the method, in which a "profile" of thousands of DNA markers from a crime scene is uploaded to genealogy websites to find relatives of the culprit. The new law, sponsored by Democratic lawmakers, also dictates that the technique be used only for serious crimes, such as murder and sexual assault. And it states that investigators may only use websites with strict policies around user consent. Montana's new law, sponsored by a Republican, is narrower, requiring that government investigators obtain a search warrant before using a consumer DNA database, unless the consumer has waived the right to privacy.

The laws "demonstrate that people across the political spectrum find law enforcement use of consumer genetic data chilling, concerning and privacy-invasive," said Natalie Ram, a law professor at the University of Maryland who championed the Maryland law. "I hope to see more states embrace robust regulation of this law enforcement technique in the future." Privacy advocates like Ms. Ram have been worried about genetic genealogy since 2018, when it was used to great fanfare to reveal the identity of the Golden State Killer, who murdered 13 people and raped dozens of women in the 1970s and '80s. After matching the killer's DNA to entries in two large genealogy databases, GEDmatch and FamilyTreeDNA, investigators in California identified some of the culprit's cousins, and then spent months building his family tree to deduce his name -- Joseph James DeAngelo Jr. -- and arrest him.

Privacy

Clearview AI Hit With Sweeping Legal Complaints Over Controversial Face Scraping in Europe (theverge.com) 10

Privacy International (PI) and several other European privacy and digital rights organizations announced today that they've filed legal complaints against the controversial facial recognition company Clearview AI. From a report: The complaints filed in France, Austria, Greece, Italy, and the United Kingdom say that the company's method of documenting and collecting data -- including images of faces it automatically extracts from public websites -- violates European privacy laws. New York-based Clearview claims to have built "the largest known database of 3+ billion facial images."

PI, NYOB, Hermes Center for Transparency and Digital Human Rights, and Homo Digitalis all claim that Clearview's data collection goes beyond what the average user would expect when using services like Instagram, LinkedIn, or YouTube. "Extracting our unique facial features or even sharing them with the police and other companies goes far beyond what we could ever expect as online users," said PI legal officer Ioannis Kouvakas in a joint statement.

Open Source

Redditors Aim to 'Free Science' From For-Profit Publishers (interestingengineering.com) 63

A group of Redditors came together in a bid to archive over 85 million scientific papers from the website Sci-Hub and make an open-source library that cannot be taken down. Interesting Engineering reports: Over the last decade or so, Sci-Hub, often referred to as "The Pirate Bay of Science," has been giving free access to a huge database of scientific papers that would otherwise be locked behind a paywall. Unsurprisingly, the website has been the target of multiple lawsuits, as well as an investigation from the United States Department of Justice. The site's Twitter account was also recently suspended under Twitter's counterfeit policy, and its founder, Alexandra Elbakyan, reported that the FBI gained access to her Apple accounts.

Now, Redditors from a subreddit called DataHoarder, which is aimed at archiving knowledge in the digital space, have come together to try to save the numerous papers available on the website. In a post on May 13, the moderators of r/DataHoarder, stated that "it's time we sent Elsevier and the USDOJ a clearer message about the fate of Sci-Hub and open science. We are the library, we do not get silenced, we do not shut down our computers, and we are many." This will be no easy task. Sci-Hub is home to over 85 million papers, totaling a staggering 77TB of data. The group of Redditors is currently recruiting for its archiving efforts and its stated goal is to have approximately 8,500 individuals torrenting the papers in order to download the entire library. Once that task is complete, the Redditors aim to release all of the downloaded data via a new "uncensorable" open-source website.

Businesses

FTC is Prodding the Tech Giant To Punish Fake-Review Schemers (vox.com) 29

An anonymous reader shares a report: Amazon recently banned some sellers of large Chinese electronics brands like Aukey and Mpow that reportedly do hundreds of millions in sales on the shopping site each year. The bans followed a database leak that appeared to tie some of the brands to paid-review schemes, which Amazon prohibits and says it strictly polices. But while some press coverage implied that Amazon took these actions in response to the database leak, internal employee messages viewed by Recode show that pressure from the Federal Trade Commission (FTC) led to at least one of the notable bans.

Communications between Amazon employees viewed by Recode also appear to expose an inconsistent punishment system in which employees need special approval for suspending certain sellers because of their sales numbers, while some merchants are able to keep selling products to Amazon customers despite multiple policy violations and warnings. The leaked internal messages also revealed several other instances in recent months of FTC inquiries pressuring Amazon to take action against merchants engaging in fake-review schemes. Amazon has long said that it aggressively polices fake reviews, but the frequency with which the FTC has pressured the company to police merchants that run paid-review programs has not been previously known.

Medicine

99.992% of Fully Vaccinated People Have Dodged COVID, CDC Data Shows (arstechnica.com) 143

An anonymous reader quotes a report from Ars Technica: Cases of COVID-19 are extremely rare among people who are fully vaccinated, according to a new data analysis by the Centers for Disease Control and Prevention. Among more than 75 million fully vaccinated people in the US, just around 5,800 people reported a "breakthrough" infection, in which they became infected with the pandemic coronavirus despite being fully vaccinated. The numbers suggest that breakthroughs occur at the teeny rate of less than 0.008 percent of fully vaccinated people -- and that over 99.992 percent of those vaccinated have not contracted a SARS-CoV-2 infection.
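
Those headline percentages follow directly from the two figures quoted; a quick check of the arithmetic:

```python
# Reproduce the CDC-derived figures quoted above.
fully_vaccinated = 75_000_000          # "more than 75 million"
breakthroughs = 5_800                  # "just around 5,800"
rate = breakthroughs / fully_vaccinated
print(f"breakthrough rate: {rate:.5%}")          # about 0.00773%, under 0.008%
print(f"no reported infection: {1 - rate:.3%}")  # about 99.992%
```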

The figures come from a nationwide database that the CDC set up to keep track of breakthrough infections and monitor for any concerning signs that the breakthroughs may be clustering by patient demographics, geographic location, time since vaccination, vaccine type, or vaccine lot number. The agency will also be keeping a close eye on any breakthrough infections that are caused by SARS-CoV-2 variants, some of which have been shown to knock back vaccine efficacy. [...] The extraordinary calculation that 99.992 percent of vaccinated people have not contracted the virus may reflect that they all simply have not been exposed to the virus since being vaccinated. Also, there are likely cases missed in reporting. Still, the data is a heartening sign.
As for the "breakthroughs," the agency says many of them occurred in older people, who are more vulnerable to COVID-19. There are some scattered through every age group, but more than 40 percent were in people ages 60 and above.

"We see [breakthroughs] with all vaccines," top infectious disease expert Anthony Fauci said in a press briefing earlier this week. "No vaccine is 100 percent efficacious or effective, which means that you will always see breakthrough infections regardless of the efficacy of your vaccine."

Facebook

There's Another Facebook Phone Number Database Online (vice.com) 7

An online tool lets customers pay to unmask the phone numbers of Facebook users that liked a specific Page, and the underlying dataset appears to be separate from the 500 million account database that made headlines last week, signifying another data breach or large scale scraping of Facebook users' data, Motherboard reports. From the report: Motherboard verified the tool, which comes in the form of a bot on the social network and messaging platform Telegram, outputs accurate phone numbers of Facebook users that aren't included in the dataset of 500 million users. The data also appears to be different to another Telegram bot outputting Facebook phone numbers that Motherboard first reported on in January. "Hello, can you tell me how you got my number?" one person included in the dataset asked Motherboard when reached for comment. "Omg, this is insane," they added. Another person returned Motherboard's call and, after confirming their name, said "If you have my number then yes it seems the data is accurate."

A description for the bot reads "The bot give [sic] out the phone numbers of users who have liked the Facebook page." To use the bot, customers need to first identify the unique identification code of the Facebook Page they want to get phone numbers from, be that a band, restaurant, or any other sort of Page. This is possible with at least one free to use website. From there, customers enter that code into the bot, which provides a cost of the data in U.S. dollars and the option to proceed with the purchase, according to Motherboard's tests. A Page with tens of thousands of likes from Facebook users can cost a few hundred dollars, the bot shows. The data for Motherboard's own Page would return 134,803 results and cost $539, for example.
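
For a sense of scale, the pricing example quoted above works out to well under a cent per record:

```python
# Implied unit price from the figures quoted for Motherboard's own Page.
results = 134_803
price_usd = 539
print(f"${price_usd / results:.4f} per phone number")  # roughly $0.0040
```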
