No Child Left Un-Mined? Student Privacy at Risk in the Age of Big Data

A multibillion-dollar ed-tech industry says it needs to track student performance to improve education, but privacy advocates and parents worry about children's data being exploited for profit.

On Facebook, it’s the season when parents post pictures of K-12 graduations, including moppets in tiny mortarboards. But unlike a generation ago, today’s smallest graduates are amassing a big data trail. Just as medical and government files have been digitized — some to be anonymized and sold, all susceptible to breaches — student data has entered the realm of the valuable and the vulnerable. Parents are paying attention. A recent study by the company The Learning Curve found that while 71 percent of parents believe technology has improved their child’s education, 79 percent were concerned about the privacy and security of their child’s data, and 75 percent worried about advertiser access to that data.

The fear is that the multibillion-dollar education technology (or “ed-tech”) industry, which seeks to individualize learning and reduce dropout rates, could also pose a threat to privacy: a rush to commercialize student data could leave children tagged for life with indicators based on their childhood performance.

“What if potential employers can buy the data about you growing up and in school?” asks mathematician Cathy O’Neil, who is finishing a book on big data and blogs at mathbabe.org. Some educational tracking systems literally log a child’s progress in software keystroke by keystroke, and in those systems, she says, “We’re giving a persistence score as young as age 7 — that is, how easily do you give up or do you keep trying? Once you track this and attach this to [a child’s] name, the persistence score will be there somewhere.” O’Neil worries that just as credit scores are now being used in hiring decisions, predictive analytics based on educational metrics may be applied in unintended ways.
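To make O’Neil’s example concrete, here is a minimal sketch of how a “persistence score” might be derived from fine-grained event logs. The event format and the formula are invented for illustration; no vendor publishes its actual scoring method.

```python
# Hypothetical sketch of a "persistence score" computed from
# keystroke-level event logs. Event format and formula are invented
# for illustration, not any real ed-tech product's method.

events = [
    {"problem": "frac-add-1", "result": "wrong"},
    {"problem": "frac-add-1", "result": "wrong"},
    {"problem": "frac-add-1", "result": "correct"},   # kept trying, got it
    {"problem": "frac-add-2", "result": "wrong"},
    {"problem": "frac-add-2", "result": "abandoned"},  # gave up
]

def persistence_score(events):
    """Of the problems a student got wrong on the first try,
    what fraction did they eventually solve?"""
    first_result = {}
    eventually_correct = set()
    for e in events:
        first_result.setdefault(e["problem"], e["result"])
        if e["result"] == "correct":
            eventually_correct.add(e["problem"])
    struggled = [p for p, r in first_result.items() if r != "correct"]
    if not struggled:
        return 1.0
    return len([p for p in struggled if p in eventually_correct]) / len(struggled)

print(persistence_score(events))  # 0.5: recovered on one of two missed problems
```

The arithmetic is trivial; the concern O’Neil raises is the storage step. Once a number like this is computed and attached to a named student record, it can outlive its classroom context and surface years later.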

Such worries came to the fore last week when educational services giant Pearson announced that it was selling PowerSchool, which tracks student performance, to a private equity firm for $350 million. PowerSchool was founded as an independent company, sold to Apple, then to Pearson, and now to Vista Equity Partners. Each owner in turn has had to decide how to manage the records of some 15 million students across the globe, according to Pearson. Pearson did not sign the Student Privacy Pledge, an initiative whose signatories promise not to sell student information or behaviorally target advertising; 151 other companies, including Google, have signed the non-binding pledge.

A Pearson spokesperson said, “We do not use personal student data to sell or market Pearson products or services. The data is entrusted to us as a part of our work with schools and institutions and is guarded by federal and state laws. From a security perspective, when an education institution or agency entrusts Pearson with personally identifiable student information, we work directly with the organization to ensure the data is protected and our controls are consistent with relevant requirements.”

PowerSchool takes in a wide variety of data. Its site touts administrator tools including discipline management and reporting; student and staff demographics; and family management. Brendan O’Grady, VP of media and communities for Pearson, says the company has provided ways for educators to track the performance of individual students and groups of students in order to serve them better. “Big data and all of the associated technologies have really improved all of the technologies in the world, the way we travel and communicate and more,” he says. “But we haven’t seen a similar advance in the way we use data in education. There are very legitimate questions about data security and around what works best for schools. But there should be some very positive experiences using big data to give better feedback on what needs to be learned. That’s the biggest opportunity.”

The biggest flame-out so far in the ed-tech arena has been inBloom, a company that had a stellar lineup of support — $100 million of it — from sources including the Bill & Melinda Gates Foundation and the Carnegie Corporation of New York. In Louisiana, parents were incensed that school officials had uploaded student Social Security numbers to the platform. After several other states ended their relationships, New York, the last remaining client, changed state law to forbid giving student data to companies that store it in dashboards and portals. InBloom announced in April 2014 that it was shutting down.

When inBloom first launched under the name the Shared Learning Collaborative (SLC), Vicki Phillips of the Gates Foundation described the venture as “a huge app store — just for teachers — with the Netflix and Facebook capabilities we love the most.” (The Gates Foundation did not respond to a request for comment.) An ed-tech industry source said that by launching with a pitch for developers to build apps tailored to the platform, rather than with messaging for parents and educators worried about privacy, inBloom put itself on a collision course with those very parents.

One of them was Leonie Haimson, who runs the advocacy group Class Size Matters. The group opposed SLC/inBloom and its partnership with Wireless Generation, which did educational database work for New York State and is owned by Rupert Murdoch’s News Corporation. “I was concerned about inBloom because they were going to aggregate student data from at least nine states [and seek more state partners], and put it into an easily digestible form, offering contracts to vendors who created educational products. But,” she adds, “you can’t expect vendors to tell parents what they’re doing. You need the schools and the districts to take responsibility for telling them what data is being shared, for what reason, and under what conditions.” O’Neil agrees: “I’ve worked as a data scientist, and for venture capitalists; they’ll ask me to talk to entrepreneurs about their ideas. It’s extremely unsettling. The perspective of the entrepreneurs in big data is that they’re just trying to figure out how to make money and what the laws and regulations are. But the regulations are nowhere.”

While battling inBloom, Haimson found out that a federal law, the Family Educational Rights and Privacy Act of 1974, or FERPA, had been weakened in recent years, making it easier for schools to share student data and personally identifiable information without parental consent. Today, several bills pending in Congress, including the Protecting Student Privacy Act in the Senate and the Student Digital Privacy and Parental Rights Act of 2015 in the House, aim to tighten privacy rules; none seems fast-tracked for passage. That said, Haimson sees a coalition between left and right developing around student privacy, as it has around some other privacy and civil liberties issues.

But the approaches and data sets of ed-tech companies vary widely. The company Clever has a very different model than inBloom did. It’s a platform that manages permissions for apps used in classrooms, offering its own single sign-on (akin to the “log in with Facebook” button on many commercial websites) so teachers and kids don’t have to remember multiple passwords. Clever is free to schools and charges developers. It also takes in far less data than some ed-tech companies: roster information (name, teacher, class), not grades or disciplinary records. Tyler Bosmeny, Clever’s CEO, says, “Technology has tremendous potential to improve the lives of students and teachers. But none of it will come to pass if we don’t set higher standards for student data security. That’s exactly what Clever is working with thousands of schools across the country to do.” Clever was the chosen platform for Share My Lesson, an app developed by the American Federation of Teachers, which was vocal in criticizing inBloom and Pearson.
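The difference in scope is easy to picture. Here is a rough sketch contrasting a roster-only record, in the spirit of what Clever describes, with the fuller student-information-system profile that PowerSchool’s advertised administrator tools imply. All field names and values are hypothetical.

```python
# Hypothetical field names, sketching the difference in data scope.

# A roster-only integration, in the spirit of what Clever describes:
roster_record = {
    "student_id": "s-2041",
    "name": "J. Rivera",
    "teacher": "Ms. Chen",
    "class": "Grade 4, Room 12",
}

# The fuller student-information-system profile implied by tools for
# demographics, discipline management, and family management:
sis_record = {
    **roster_record,
    "grades": {"math": "B+", "reading": "A-"},
    "discipline_incidents": 2,
    "demographics": {"birth_date": "2007-03-14", "gender": "F"},
    "family": {"guardian": "M. Rivera", "phone": "555-0142"},
}
```

Every field beyond the roster is one more thing a breach can expose or a future owner can monetize, which is why data minimization functions as a security control in its own right.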

Nonetheless, even for Clever, mastering data security has not been a seamless effort. Last year, the company was criticized for a clause in its privacy policy saying the policy could be changed without school consent or notification. In response, the company created a system for soliciting public comments on the policy and then, taking critiques into account, forged a new policy requiring the company to give schools notice before privacy changes take effect, plus time to opt out.

Concerns about corporate use of student data are the most prevalent. But New York University professor and computer scientist Meredith Broussard says, “Teachers are vulnerable. When they don’t have budgets for software, they’re encouraged to use free online resources. When teachers force students to use free sites, they’re essentially giving away student data for free.” In the context of tight education budgets, where public school teachers frequently dig into their own pockets to pay for basic classroom supplies, such free services are tempting — but teachers can inadvertently become the conduit for leaky student data. Broussard recounts the tale of a teacher who didn’t like her school’s in-house attendance system: she found a free platform online, built a system for entering attendance and grades, and got many others in the school to use it. Was it secure? Probably not.

States are left trying to balance the need to aid teachers with learning and administrative technology against the concerns of parents. Lan Neugent is the interim executive director of the State Educational Technology Directors Association (SETDA). “There’s a fine balance between companies holding and utilizing data and the decision makers in schools, parents, students impacted by data,” he says. “Every time there’s a Target or a Home Depot or a big data breach, people say it’s dangerous to have data out there — which it can be. But technology gives kids vehicles for individualized learning. Our members would say there’s not enough research being done.”

Researchers like Susan Dynarski, a professor at the University of Michigan, argue that over-regulating student data can hurt research. In a piece for the New York Times, she wrote that one of the several federal bills introduced “would effectively end the analysis of student data by outside social scientists. This legislation would have banned recent prominent research documenting the benefits of smaller classes, the value of excellent teachers and the varied performance of charter schools.”

Susan McGregor, a data journalist and assistant director of the Tow Center for Digital Journalism at Columbia University, sees the need for research but adds, “Both in the popular consumer sphere and from the research perspective, the way we’re handling big data privacy doesn’t work.” University review boards have more stringent rules about how information is anonymized than many private companies do. Still, she says, “these days you can take a small data set from one place and cross it with another data set and de-anonymize people. And we’re in a commercial culture that’s very interested in collecting everything forever and never destroying it.”
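What McGregor describes is known to privacy researchers as a linkage attack. A toy sketch of the mechanics, using invented records: even after names are stripped from a student data set, the quasi-identifiers left behind (ZIP code, birth date, gender) can be matched against a second data set that still carries names.

```python
# Toy linkage attack: re-identifying "anonymized" student records by
# crossing them with a second data set that shares quasi-identifiers.
# All records are invented for illustration.

anonymized_scores = [
    {"zip": "10027", "birth_date": "2007-03-14", "gender": "F", "persistence": 0.41},
    {"zip": "10027", "birth_date": "2006-11-02", "gender": "M", "persistence": 0.88},
]

# A public or purchasable list (sports roster, yearbook, marketing file)
# that carries names alongside the same quasi-identifiers.
public_roster = [
    {"name": "Jane Doe", "zip": "10027", "birth_date": "2007-03-14", "gender": "F"},
    {"name": "John Roe", "zip": "10027", "birth_date": "2006-11-02", "gender": "M"},
]

QUASI_IDENTIFIERS = ("zip", "birth_date", "gender")

def key(record):
    """Join key built from the fields the 'anonymization' left intact."""
    return tuple(record[f] for f in QUASI_IDENTIFIERS)

names_by_key = {key(r): r["name"] for r in public_roster}

for record in anonymized_scores:
    name = names_by_key.get(key(record))
    if name:
        print(f"{name}: persistence score {record['persistence']}")
# Jane Doe: persistence score 0.41
# John Roe: persistence score 0.88
```

Real attacks work the same way at scale: Latanya Sweeney’s well-known finding is that ZIP code, birth date, and sex alone are enough to uniquely identify the large majority of Americans.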

Some European regulations require explicit terms of use for consumer data permissions, rather than the blanket, open-ended ones that tend to exist in the United States. To improve the system, says McGregor, there should be a “privacy first” approach to using student data. Requirements should include strong technical personnel; the ability of parents and students (particularly once they are adults) to view their data and raise corrections; a strong opt-out provision; and assurances the data will never be used for anything other than research.

Cathy O’Neil says the questions facing ed-tech are not just technical, but sociopolitical: It is critical to ask who is targeted for services, and why. “I know a bunch of people who were at inBloom. Some people there were thoughtful, but not the ones in charge. They wanted to turn it into the next Facebook. ‘We’re going to “win” education.’ These were all white rich people who don’t understand the complexity of what’s going on in inner city schools,” she continues. “The belief that data can solve our deepest problems, like inequality and access, is wrong. Whose kids have been exposed by their data is absolutely a question of class.”

Correction: An earlier version of this piece stated that Class Size Matters is partially funded by the National Education Association. Although Class Size Matters accepted $25,000 from the NEA in 2010, executive director Leonie Haimson says it did so as the fiscal agent for Parents Across America and that Class Size Matters has accepted no additional funding from the union since. We regret the error.

