How app developers are hacking your brain to boost ratings

The implications of the inflation are far-reaching. Millions of companies use some kind of mobile app to reach Apple’s nearly 1 billion users. Commerce on the App Store grew to more than $US500 billion ($692 billion) last year – more than most nations’ gross domestic product.

The average user, globally, spends 27 per cent of their daily waking hours on a mobile device, according to App Annie, a mobile data and analytics provider. Apple faces criticism – and a lawsuit from Fortnite developer Epic Games – for the 30 per cent fee it charges on revenue generated through the App Store. But that applies only to the 16 per cent of apps that charge fees, whereas ratings inflation affects every app.

Competition between apps is intense, so garnering a high score is critical. Apptentive, a reputation management group, calls ratings the “lifeblood of the mobile app world”. Its research suggests that jumping from two stars to three stars can increase downloads by 306 per cent, while the leap from three stars to four delivers a 92 per cent boost. Gummicube, which helps companies with App Store optimisation, says four-fifths of users do not trust an app with ratings below 4 stars.

“Everyone is incentivised to paint this positive world,” says Khalifah. “The developer gets more installs, Apple gets more commission – it’s this snowball effect where you get more and more positivity.

“The problem,” he adds, “is that the truth gets rained on.”

The trigger for this inflation was Apple’s seemingly innocuous update to boost consumer engagement in September 2017. Users no longer had to go to the App Store to rate an app – a system that often attracted only frustrated users.

Multiple loopholes

Instead, with the introduction of iOS 11, Apple granted developers the ability to offer “in-app prompts”. The virtue of these prompts was that they generated participation and, arguably, overcame the “responder bias” that gave a loudspeaker to negative voices. Targeting a wider swath of people was meant to increase accuracy.

In one sense, it was a big success. Engagement soared. The average app went from receiving 19,000 ratings in 2017 to more than 100,000 in 2019, according to Apptentive. By contrast, ratings in Google’s Play Store – which did not offer in-app ratings in this period – climbed only from 33,000 to 43,000.

But the way Apple designed the system has allowed developers to exploit multiple loopholes and steer consumers into inflating their ratings, say critics. Because developers can request the in-app prompt at a time of their choosing, they can achieve “sample bias” by zeroing in on their fans and avoiding users deemed a risk.
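
The effect of choosing whom to ask can be illustrated with a toy calculation. The sketch below is not drawn from any real app: the user opinions and the `is_fan` cut-off are invented for illustration. It simply compares the average rating when everyone is asked with the average when only likely fans are asked.

```python
from statistics import mean

# Hypothetical "true" opinions of ten users, in stars (invented data).
true_opinions = [1, 2, 2, 3, 3, 4, 4, 5, 5, 5]


def is_fan(stars: int) -> bool:
    """Assumed stand-in for whatever signal a developer uses to spot happy users."""
    return stars >= 4


# Asking everyone reflects the whole user base.
average_if_all_asked = mean(true_opinions)

# Asking only likely fans leaves unhappy users out of the sample.
average_if_fans_asked = mean(s for s in true_opinions if is_fan(s))

print(f"all users: {average_if_all_asked:.1f} stars")
print(f"fans only: {average_if_fans_asked:.1f} stars")
```

Nothing about the app changes between the two figures; only the set of people who are asked does.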

Apple requires developers to use a standard interface that asks for a one- to five-star rating, which it says is designed to collect honest feedback. However, developers can introduce “framing bias”. If they prompt users with a positive note – such as “congratulations on hitting a high score!” – then solicit them for a rating just after, the chances of a five-star rating improve.

The tech giant prohibits developers from prompting users with a message that says “How would you rate this app?”, seeing the answer, then asking for an official App Store rating. However, developers can still “prime” consumers by tweaking the question. Video conferencing apps can ask “How was the quality of your call?” to suss out the five-star responses – and only then ask Apple for the official ratings prompt.

“What they are doing is tipping the scales in their favour, in a public rating,” says Rob Markey, a consultant at Bain & Company and co-creator of Net Promoter Score, a metric that helps companies measure, manage and improve customer loyalty. “As companies get better and better at manipulating the scores, the ratings systems themselves become less and less useful to consumers.”

Other platforms have experienced issues with inflated ratings. Amazon is investigating the most prolific reviewers on its British website after a Financial Times investigation found evidence that they were profiting from posting thousands of five-star ratings.

Apple users are allowed to opt out of receiving in-app prompts. Moreover, they can, at any time, go to the App Store and write a negative review, and Apple does not allow developers to block them. However, it does allow app makers to “reset” their ratings, and because in-app prompts are so effective at getting ordinary users to tap 5 stars, negative vibes can be drowned out. Sikorsky cites one client whose app had 1090 one-star reviews, but within weeks of changing the feedback mechanism the app received more than 35,000 ratings – with 90 per cent giving it 5 stars.
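
A rough calculation shows how thoroughly a wave of new ratings can swamp old complaints. The sketch below rests on assumptions the article does not confirm: that the 1090 one-star reviews still count towards the average (no reset), that exactly 35,000 new ratings arrived, and that the 10 per cent of new ratings that were not 5 stars fall somewhere between 1 and 4 stars.

```python
# Figures from the article.
old_one_star = 1090       # one-star reviews before the change
new_ratings = 35_000      # ratings received after the change ("more than")
five_star_share = 0.90    # share of new ratings that were 5 stars

new_five_star = round(new_ratings * five_star_share)  # 31,500 ratings
new_other = new_ratings - new_five_star               # 3,500 ratings, stars unknown


def overall_average(other_stars: int) -> float:
    """Mean rating if every non-five-star new rating had `other_stars` stars."""
    total_stars = old_one_star * 1 + new_five_star * 5 + new_other * other_stars
    total_count = old_one_star + new_ratings
    return total_stars / total_count


lowest = overall_average(1)    # pessimistic assumption about the unknown 10 per cent
highest = overall_average(4)   # optimistic assumption about the unknown 10 per cent
print(f"average between {lowest:.2f} and {highest:.2f} stars")
```

Even under the pessimistic assumption, the old one-star reviews barely register in the combined average.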

Bans for developers

“It has very much been engineered,” says Wendy Johansson, a user experience designer at consultancy Publicis Sapient.

Apple has tried to prevent developers from nudging users into giving a higher rating and threatens to ban developers who violate the rules. In response to questions from the Financial Times, Apple says it has removed apps from the App Store, and developers from its Apple Developer Program, for breaking its rules.

“Our App Store Review Guidelines make it clear that any developer who attempts to cheat the system, such as by manipulating ratings or how their app appears in search results, may have their app taken down and could be removed from the Developer Program,” Apple adds.

Yet there is evidence that developers have found numerous ways to game the system, without violating Apple’s rules. When asked about their tactics, developers point to Apple’s own in-app prompt guidelines, which state: “Make the request when users are most likely to feel satisfaction with your app, such as when they’ve completed an action, level, or task.”

For Khalifah, one unintended consequence stemmed from Apple’s decision to limit developers to asking each user for a rating just three times a year, per app. This was designed to avoid irritating consumers, but in effect it made the in-app prompts a scarce commodity. That incentivised developers to build “Black Mirror-style algorithms” – a reference to the British dystopian technology TV series – to figure out when users were most happy, he says.
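
The scarcity logic Khalifah describes can be sketched in a few lines. This is a hypothetical illustration, not Apple’s API or any developer’s actual algorithm: the `PromptBudget` class, the `satisfaction` score and its 0.8 threshold are all invented, and only the three-per-year limit comes from the article.

```python
from datetime import datetime, timedelta

MAX_PROMPTS_PER_YEAR = 3        # limit described in the article
WINDOW = timedelta(days=365)


class PromptBudget:
    """Hypothetical per-user tracker; not part of any Apple API."""

    def __init__(self, satisfaction_threshold=0.8):
        self.satisfaction_threshold = satisfaction_threshold  # assumed cut-off
        self.prompt_times = []

    def remaining(self, now):
        """Prompts left in the 365 days ending at `now`."""
        recent = [t for t in self.prompt_times if now - t < WINDOW]
        return MAX_PROMPTS_PER_YEAR - len(recent)

    def should_prompt(self, satisfaction, now):
        """Spend a prompt only when the user seems happy and budget remains."""
        if self.remaining(now) <= 0:
            return False
        if satisfaction < self.satisfaction_threshold:
            return False
        self.prompt_times.append(now)
        return True


budget = PromptBudget()
start = datetime(2020, 1, 1)
decisions = [
    budget.should_prompt(0.30, start),                        # unhappy: skipped
    budget.should_prompt(0.90, start + timedelta(days=1)),    # happy: prompt 1
    budget.should_prompt(0.95, start + timedelta(days=2)),    # happy: prompt 2
    budget.should_prompt(0.85, start + timedelta(days=3)),    # happy: prompt 3
    budget.should_prompt(0.99, start + timedelta(days=4)),    # budget exhausted
]
print(decisions)
```

Because each prompt is precious, the incentive is to refine the satisfaction estimate so that none of the three is spent on a user likely to give a low score.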

As a result, says Levine, App Store ratings have been compromised, to the benefit of dominant players. “It’s anti-competitive, because only the big companies with more money are able to take advantage of this situation effectively,” he adds.

He argues that higher ratings can stifle innovation, because developers can create a mediocre app and still garner a 4.5-star average rating. “A lot of apps aren’t being worked on as much as they should because all the indications are that customers like them,” he says.

Widespread ratings inflation

Exactly how much ratings have soared is difficult to pinpoint, as Apple does not provide complete ratings data and history. But third parties have documented widespread ratings inflation after the introduction of iOS 11.

Among America’s seven biggest banking apps, ratings that varied between 1.2 and 4.9 stars in early 2017 are now 4.8 stars for all. In the Google Play store for Android devices, the highest rated among these same apps is 4.7 and the lowest is 4.4.

Even the apps ranked 50th most popular in the categories for shopping, lifestyle, finance, travel and entertainment are all rated at least 4.8 stars in the App Store. In the Play Store, apps with the same ranking vary between 3.8 and 4.7 stars, according to App Annie.

When Levine analysed a cluster of eight popular apps that had introduced the in-app ratings prompt, he found the average score climbed from 3 stars to 4.7 stars within six months, while the number of user ratings shot up by a factor of 62.
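
Levine’s figures imply how positive the newly collected ratings must have been. The arithmetic below is a sketch under stated assumptions: it treats the eight apps as a single pool, reads “a factor of 62” as the total number of ratings growing 62-fold, and assumes the displayed score is a simple mean of all ratings with no reset.

```python
old_average = 3.0     # stars before the prompt (from the article)
new_average = 4.7     # stars six months later (from the article)
growth_factor = 62    # total ratings grew 62-fold (from the article)

# With n old ratings and 62n in total, 61n ratings are new.
# new_average * 62n = old_average * n + implied * 61n, so n cancels out.
implied_new_only = (new_average * growth_factor - old_average) / (growth_factor - 1)

print(f"implied average of the new ratings: {implied_new_only:.2f} stars")
```

On these assumptions the ratings gathered after the change averaged slightly above the headline 4.7, since they also had to offset the older, lower scores.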

Sandwich chain Subway struggled with poor app ratings for years before its score jumped from 1.7 stars to 4 stars within two weeks in early 2018. A note for the software update said it resolved a few minor bugs, while the main new feature was “[making] it easier to rate the app and provide feedback”.

The idea that higher ratings simply reflect better quality apps for the iPhone and iPad is contradicted by data showing that ratings with written reviews attached have experienced no inflation at all.

“We actually see a drop in the average review score on iOS among all apps and games, from 4.2 in August 2017 to 3.9 in September 2017, to 3.4 by July 2020,” says Lexi Sydow, senior market insights manager at App Annie.

Written reviews no longer carry much weight, as developers can filter out many of the one-star ratings and amplify higher scores without even using sophisticated techniques.

The simplest method, says Apptentive, whose clients include eBay, CNN, and Alaska Airlines, is called “the love dialogue”.

It recommends that developers prime users with a simple message: “Do you love [this app]?” When a user clicks “no”, they are directed towards a private feedback channel. When they click “yes”, they receive Apple’s official “rate this app” interface.
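
The routing described above amounts to a single branch. The sketch below is illustrative only: `route_love_dialogue` and the two destination labels are hypothetical names, not calls from Apptentive’s or Apple’s software, and the sample answers are invented.

```python
def route_love_dialogue(loves_app: bool) -> str:
    """Return the next step after the user answers "Do you love [this app]?".

    The destination names are hypothetical labels, not real API calls.
    """
    if loves_app:
        return "official_rating_prompt"   # Apple's standard star-rating interface
    return "private_feedback_channel"     # feedback that never reaches the store


answers = [True, True, False, True, False]   # invented responses
routes = [route_love_dialogue(a) for a in answers]
public = routes.count("official_rating_prompt")

print(f"{public} of {len(answers)} users are shown the public rating prompt")
```

Only users who have already signalled approval ever see the public star rating; the rest are diverted before they can lower the score.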

Ashley Sefferman, Apptentive’s head of content, says she does not consider this “gaming”. Rather, it helps developers channel “actionable” feedback and hear more from their fans.

However, Apptentive statistics show that about two-fifths of users who click “no” to the love dialogue are deemed a risk and are steered away from a public review. Sefferman has been recommending the technique since at least 2016 and calls it so effective there is little excuse for having a low rating.

“The reason your app doesn’t have five stars is because the way you ask for in-app feedback is incorrect,” says an online Apptentive “how to” guide.

Google’s Android had long resisted offering in-app ratings, despite pressure from developers. Before 2017, the percentage of five-star ratings for Android apps downloaded from the Play Store was higher than Apple’s for all five categories tracked by Apptentive. But since 2017 App Store ratings have taken a commanding lead.

That is likely to change. On August 5, Android relented and began offering in-app rating prompts. Like Apple, Android says its intention is for developers to get more “honest and unbiased” feedback. But it also cites developers praising the tool for helping them achieve, as one put it, an “all-time highest rating just a week after we implemented in-app reviews” – a clear acknowledgment that developers can expect a boost in ratings irrespective of whether they actually improve their app.

Bain’s Markey says creating a marketplace with fair ratings should be critical for any platform provider. “It’s like, you have one job,” he says. “If you don’t do that, you lose buyers or you lose sellers, eventually.”

But developers and consumers face the same problem: aside from Apple and Google, smartphone users have nowhere else to go.

— Financial Times