Most of us have come to accept that some of our information is going to be tracked when using the Internet. We have gotten used to seeing ads on Facebook for those watches we were looking at on Amazon weeks ago. Most people do not even bother reading privacy policies anymore, but it is still important to know what kind of information is being tracked and how it is being collected.

Researchers at Princeton University’s Center for Information Technology Policy (CITP) have discovered that more of your information is being tracked than you might know. Their study has uncovered that several popular websites are using scripts that log every keystroke and mouse click and save recordings of them to third-party servers. Even if you cancel or abandon the web form, everything you typed is still recorded and saved.

The keylogging software, known as “session replay scripts,” is being openly used by multiple sites. The scripts are usually supplied by third-party providers such as FullStory, SessionCam, Clicktale, SmartLook, UserReplay, Hotjar and Yandex. Site administrators can pull up any recorded session and play it back like a video.

According to lead researcher Steve Englehardt, most people do not even realize they are being tracked in this manner since session replay disclosures are buried “deep into the privacy policy.”

“I’m just happy that users will be made aware of it,” Englehardt told Motherboard in a telephone interview.

Englehardt and his colleagues, Gunes Acar and Arvind Narayanan, studied six of the seven session replay providers mentioned above and found that software from at least one of them was being used on 482 of the top 50,000 sites (as ranked by Alexa). The list of nearly 500 websites includes several well-known names, among them WordPress, Microsoft, Spotify, Xfinity and Walgreens.

Upon being presented with the research, Walgreens issued a statement.

“We take the protection of our customers’ data very seriously and are investigating the claims made in the study that was published yesterday. As we look into the concerns that were raised, and out of an abundance of caution, we have stopped sharing data with FullStory.”

Bonobos, another company identified in the list, told Wired that it has also stopped sharing data with FullStory. “We are continually assessing and strengthening systems and processes in order to protect our customers’ data,” the spokesperson said.

“Collection of page content by third-party replay scripts may cause sensitive information such as medical conditions, credit card details, and other personal information displayed on a page to leak to the third-party as part of the recording,” warn the researchers. It is also possible for passwords to be revealed despite the fact that the software is supposed to redact them.

The session replay scripts include tools for redacting sensitive information, but in testing the software, CITP found that some data is only partially redacted or not removed at all. On Walgreens’ website, for instance, data such as medical conditions, prescriptions and users’ real names were being collected even though redaction protocols were in place.
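To see how redaction can be in place and still miss data, consider a minimal Python sketch. It assumes a redaction rule that masks values by field name. The pattern, the function name and the sample fields are all hypothetical and are not taken from any provider's actual software or from the CITP study.

```python
import re

# Hypothetical rule: field names a redaction tool might be configured to mask.
# These names are illustrative assumptions, not any vendor's real defaults.
SENSITIVE_FIELD_PATTERN = re.compile(r"password|card|cvv|ssn", re.IGNORECASE)


def redact_fields(fields: dict[str, str]) -> dict[str, str]:
    """Mask the value of every field whose name matches the pattern."""
    return {
        name: "*" * len(value) if SENSITIVE_FIELD_PATTERN.search(name) else value
        for name, value in fields.items()
    }


# Made-up form data standing in for what a script might capture.
recorded = redact_fields({
    "password": "hunter2",
    "card_number": "4111111111111111",
    "full_name": "Jane Doe",
    "prescription": "example-medication 20mg",
})

for name, value in recorded.items():
    print(f"{name}: {value}")
```

In this sketch the password and card number are masked, but the name and prescription pass through untouched because nothing in the rule anticipates them. A rule of this kind only protects the fields someone thought to list.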

Regardless of how trustworthy companies like FullStory and the others may or may not be, the researchers worry that these firms are attractive targets for malicious attacks. They point to Yandex, Hotjar and SmartLook as examples of providers that serve their session replay dashboards over unencrypted HTTP rather than secure HTTPS.
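The distinction between HTTP and HTTPS is visible in the address itself. As a rough illustration, the Python sketch below flags any address that is not served over HTTPS. The dashboard URLs are invented placeholders, not real provider addresses, and the function name is an assumption made for this example.

```python
from urllib.parse import urlparse


def uses_encrypted_transport(url: str) -> bool:
    """Return True only when the URL scheme is HTTPS."""
    return urlparse(url).scheme.lower() == "https"


# Hypothetical dashboard addresses, for illustration only.
dashboards = [
    "http://dashboard.replay-vendor.example/sessions",
    "https://dashboard.replay-vendor.example/sessions",
]

# Collect every address that would send recordings over plain HTTP.
insecure = [url for url in dashboards if not uses_encrypted_transport(url)]
print(insecure)
```

A check like this only inspects the scheme. It says nothing about whether the certificate is valid or whether every resource on the page is also loaded securely.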

Thanks to the team’s research, session replay providers are reviewing their practices as well. Yandex and SmartLook are already looking into ways to improve the security of their dashboards.

Kevin Goodings, CEO of SessionCam, stated, “Everyone at SessionCam can get behind the CITP’s conclusion: ‘Improving user experience is a critical task for publishers. However, it shouldn’t come at the expense of user privacy.’ The whole team at SessionCam lives these values every day. The privacy of your website visitors and the security of your data is of paramount importance to us.”

If you would like to see the 482 websites that are confirmed to be using session replay scripts, the list is published on Princeton’s Web Transparency website.

Image and video courtesy Princeton University