<!doctype html>
<html lang="en">
<head>
<meta charset="utf-8">
<link rel="icon" href="favicon.ico">
<meta name="viewport" content="width=device-width, initial-scale=1">
<!-- <meta name="description" content="Smart speakers collect voice input that can be used to infer sensitive information about users. Given a number of egregious privacy breaches (e.g., smart speakers constantly recording audio, outsourcing transcription to contractors, and employees listening to private and intimate interactions), there is a clear unmet need for greater transparency and control over data collection, sharing, and use by smart speaker platforms as well as third-party skills supported on them. To bridge the gap, we build an auditing framework that leverages online advertising to measure data collection, its usage, and its sharing by the smart speaker platforms. We evaluate our framework on the Amazon smart speaker ecosystem. Our results show that Amazon and third parties (including advertising and tracking services) collect smart speaker interaction data. We find that Amazon processes voice data to infer user interests and uses it to serve targeted ads on-platform (Echo devices) as well as off-platform (web). Smart speaker interaction leads to as much as 30X higher ad bids from advertisers. Finally, we find that Amazon's and skills' operational practices are often not clearly disclosed in their privacy policies."> -->
<title>In the Room Where It Happens: Characterizing Local Communication and Threats in Smart Homes</title>
<link href="bootstrap.min.css" rel="stylesheet">
<style>
.bd-placeholder-img {
font-size: 1.125rem;
text-anchor: middle;
-webkit-user-select: none;
-moz-user-select: none;
user-select: none;
}
@media (min-width: 768px) {
.bd-placeholder-img-lg {
font-size: 3.5rem;
}
}
.profiles {
display: flex;
justify-content: space-evenly;
flex-wrap: wrap;
}
.profile > img {
height: 200px;
width: auto;
box-shadow: 0 1px 3px rgb(0 0 0 / 12%), 0 1px 2px rgb(0 0 0 / 24%);
margin-bottom: 0.75em;
border-radius: 6px;
}
#logos {
display: flex;
justify-content: space-evenly;
align-items: center;
flex-wrap: wrap;
}
#davis-logo {
width: 200px;
margin: 20px;
height: auto;
}
#ppd-logo {
width: 180px;
margin: 20px;
height: auto;
}
#appcensus-logo {
width: 220px;
margin: 20px;
height: auto;
}
#irvine-logo {
width: 160px;
margin: 20px;
height: auto;
}
</style>
</head>
<body>
<div class="col-lg-8 mx-auto p-5 py-md-12">
<header class="d-flex pb-3 mb-4 border-bottom d-flex justify-content-center">
<a href="/" class="text-dark text-decoration-none">
<span class="fs-4 d-flex justify-content-center fw-bold">In the Room Where It Happens</span>
<span class="fs-4 d-flex justify-content-center fw-bold">Characterizing Local Communication and Threats in Smart Homes</span>
</a>
</header>
<main>
<p class="fs-6 col-md-12">
<!-- Smart speakers collect voice input that can be used to infer sensitive information about users. Given a number of egregious privacy breaches (e.g., <a href="https://money.cnn.com/2017/10/11/technology/google-home-mini-security-flaw/index.html">speakers constantly recording audio</a>, <a href="https://www.usatoday.com/story/tech/2019/07/11/google-home-smart-speakers-employees-listen-conversations/1702205001/">outsourcing transcription to contractors</a>, and <a href="https://www.bloomberg.com/news/articles/2019-04-10/is-anyone-listening-to-you-on-alexa-a-global-team-reviews-audio">employees listening to private and intimate interactions</a>), there is a clear unmet need for greater transparency and control over data collection, sharing, and use by smart speaker platforms as well as third-party skills supported on them. To bridge the gap, we build an auditing framework that leverages online advertising to measure data collection, its usage, and its sharing by the smart speaker platforms. We evaluate our framework on the Amazon smart speaker ecosystem. Our results show that Amazon and third parties (including advertising and tracking services) collect smart speaker interaction data. We find that Amazon processes voice data to infer user interests and uses it to serve targeted ads on-platform (Echo devices) as well as off-platform (web). Smart speaker interaction leads to as much as 30X higher ad bids from advertisers. Finally, we find that Amazon's and skills' operational practices are often not clearly disclosed in their privacy policies. -->
<br><br>
We provide a brief FAQ below to highlight some of our findings. We suggest reading the <a href="https://dspace.networks.imdea.org/handle/20.500.12761/1746">full paper</a> for more details. Please reach out to us if you have additional questions.
</p>
<hr class="col-12 col-md-12 mb-12">
<!-- <div class="row g-12">
<div class="col-md-12">
<span class="mb-0 fs-4 d-flex align-items-center text-dark FAQ"> <a class="text-dark text-decoration-none fw-bold" href="#FAQ">Frequently asked questions</a></span>
</br>
<p class="fs-5 fw-bold pt-2">What was the motivation behind this research?</p>
<p>The convenience of voice input has contributed to the rising popularity of smart speakers, such as Amazon Echo (powered by Amazon Alexa), but it has also introduced several unique privacy threats. Many of these privacy issues stem from the fact that smart speakers record audio from their environment and potentially share this data with other parties over the Internet — <a href="https://www.petsymposium.org/2020/files/papers/issue4/popets-2020-0070.pdf">even when they should not</a>. For example, smart speaker vendors or third-parties may infer users' sensitive physical (e.g., age, health) and psychological (e.g., mood, confidence) <a href="https://machinelistening.exposed/library/Rita%20Singh/Profiling%20Humans%20From%20Their%20Voice%20(38)/Profiling%20Humans%20From%20Their%20Voi%20-%20Rita%20Singh.pdf">traits from their voice</a>. In addition, the set of questions and commands issued to a smart speaker can reveal sensitive information about users' states of mind, interests, and concerns. Despite the significant potential for privacy harms, users have little-to-no visibility into what information is captured by smart speakers, how it is shared with other parties, or how it is used by such parties. Our goal is to provide this visibility, allowing consumers to better understand the privacy risks of these devices and the impact of data sharing on people's online experiences.
</p>
<p class="fs-5 fw-bold pt-2">How did you do this research?</p>
<p>We built an auditing framework that measures the collection, usage, and sharing of Amazon Echo interaction data. First, we created several interest personas and one control persona to use the Echos (one persona per Echo device). Interest personas installed and interacted with skills from specific categories, while the control persona did not install or interact with skills.
</br></br>
We then measured data collection by intercepting network traffic from Amazon and skills on the Echo device to endpoints (such as Amazon's server or third-party servers). We measured profiling by directly downloading personas' advertising interests from Amazon. We inferred data usage by observing ads targeted to our Echo personas on the web (ads on websites) and on the Echo devices (audio ads). We checked the consistency of data collection and its usage by analyzing public statements and privacy policies from Amazon and third-party skills.
</p>
<p class="fs-5 fw-bold pt-4">What are the key findings of your research?</p>
<ul>
<li><i>Which organizations collect and propagate user data?</i>
<p>Amazon Echo interaction data is collected by both Amazon and third-parties (such as advertising and tracking services). We found that as many as 41 advertisers sync (share) their cookies with Amazon. These cookies are typically linked to personal information. We find that these advertisers further sync their cookies with 247 other third parties, including advertising services.</p></li>
<li><i>Is voice-derived data used by either Amazon or third-party apps beyond purely functional purposes, such as for targeted advertising?</i>
<p>Amazon processes voice data to infer user interests. Our measurements indicate that Amazon infers advertising interests from voice data and uses those interests for on-platform audio ads and off-platform web ads from Amazon or its advertising partners. For example, in our measurements we find that advertisers bid as much as 30X higher for Echo personas that install and interact with Alexa skills. It is unclear if third-party skills infer user interests and target personalized ads.
</p></li>
<li><i>Are data collection, usage, and sharing practices consistent with the policies of Amazon and third-party skills?</i>
<p>Our measurements indicate that Amazon's and third-party skills' operational practices are often not clearly disclosed in their policies or other claims. For example, Amazon's inference of advertising interests from users' voice interactions seems to be inconsistent with their public statements. Specifically, in statements to the <a href="https://www.nytimes.com/2018/03/31/business/media/amazon-google-privacy-digital-assistants.html">New York Times (NYT)</a> and <a href="https://www.nbcmiami.com/news/local/are-smart-speakers-planting-ads-on-our-social-media-profiles/157153/">National Broadcasting Company (NBC)</a>, Amazon mentioned that they <i>"do not use voice recordings to target ads."</i> Only 10 third-party skills (2.2% of the sample) are clear about data collection practices in their privacy policies. More than 70% of the third-party skills do not mention Alexa or Amazon in privacy policies.
</p></li>
</ul>
<p class="fs-5 fw-bold pt-4">Is Amazon transparent about using Echo interaction data?</p>
<p>Amazon's inference of advertising interests from users' voice is potentially inconsistent with their public statements. Specifically, in statements to the <a href="https://www.nytimes.com/2018/03/31/business/media/amazon-google-privacy-digital-assistants.html">New York Times (NYT)</a> and <a href="https://www.nbcmiami.com/news/local/are-smart-speakers-planting-ads-on-our-social-media-profiles/157153/">National Broadcasting Company (NBC)</a>, Amazon mentioned that they <i>"do not use voice recordings to target ads."</i> While Amazon may not literally be using the "recordings" (as opposed to transcripts and corresponding activities), our results suggest that they are processing voice recordings, inferring interests, and using those interests to target ads. This distinction between voice recordings and processed recordings may not be meaningful to many users.
</br></br>
Amazon's <a href="https://www.amazon.com/gp/help/customer/display.html?nodeId=GX7NJQ4ZB8MHFRNJ">privacy policy</a> does not explicitly acknowledge or deny the usage of Echo interactions for ad targeting. Similarly, <a href="https://www.amazon.com/Alexa-Privacy-Hub/b?ie=UTF8&node=19149155011">Alexa Privacy Hub</a> and <a href="https://www.amazon.com/gp/help/customer/display.html?nodeId=201602230">Alexa Device FAQs</a>, which explain how Alexa data is used, also do not explicitly mention Echo interactions for ad targeting. This is concerning given that prior public statements may lead consumers to falsely believe that such voice-based interactions are <i>not</i> used for targeted ads.
</p>
<p class="fs-5 fw-bold pt-4">Does Amazon share voice recordings with third parties, including advertising networks?</p>
<p>We did not study whether Amazon directly shares voice recordings or transcripts with advertising networks (as opposed to inferences from voice interactions). Amazon's <a href="https://developer.amazon.com/en-US/docs/alexa/workshops/build-an-engaging-skill/design-vui/index.html">developer docs</a> state that only processed transcriptions of voice input (not the audio data) are shared with third-party skills.
</p>
<p class="fs-5 fw-bold pt-4">Everyone knows they are tracked for ad targeting, why are smart speakers / voice assistants special?</p>
<p>"Voice assistants" conjure notions of devices that serve consumers personally. But the reality is that they are far from human personal assistants in that they are controlled by, and share data with, the voice assistant providers and other parties they interact with. The goal of this work is to help consumers understand the impact of using these devices that might otherwise be considered different from other online technologies. We generally know we're being tracked and that our data is used for ads when we browse the web. Some people may be surprised to learn this about voice assistants, however, in part because voice interactions were traditionally between humans, not machines. Studies like ours help to bring transparency into the space of voice assistants and the implications of using them.
</p>
<p class="fs-5 fw-bold pt-4">Does Amazon secretly record users' conversations?</p>
<p>We did not study whether Amazon surreptitiously records users' voices when they have not engaged with Echos. We specifically looked at how data derived from intentional voice commands, which are expected to be recorded, is used for advertising purposes. <a href="https://www.petsymposium.org/2020/files/papers/issue4/popets-2020-0070.pdf">Prior work</a> found no evidence of continuous recording or secret keywords that led to unexpected recording. We do find evidence that Amazon processes voice recordings from skill interactions to infer user interests and that it uses those interests to target ads.
</p>
<p class="fs-5 fw-bold pt-4">What can users do to protect their privacy?</p>
<p>Users can opt-out of interest-based ads from Amazon on its <a href="https://www.amazon.com/adprefs">Advertising Preferences Page</a>. Users can also access additional privacy controls managed through Settings > Alexa Privacy in the Alexa app or visit <a href="https://www.amazon.com/alexa-privacy/apd/rvh">Review Voice History Page</a> to view and delete voice recordings. To manage third-party skills advertising preferences, users will need to go to each skill's app or website. Amazon also allows users to download their data (including their advertising interests) from <a href="https://www.amazon.com/gp/privacycentral/dsar/preview.html">Request My Data Page</a>.
<br><br>
Note that we did not test the effectiveness of these controls.
</p>
<p class="fs-5 fw-bold pt-4">What is your response to Amazon's statement on your study?</p>
<p><figure>
<p class="fs-7 fst-italic text-muted">"Many of the conclusions in this research are based on inaccurate inferences or speculation by the authors, and do not accurately reflect how Alexa works. We are not in the business of selling data and we do not share Alexa requests with advertising networks. Similar to what you'd experience if you made a purchase on Amazon.com or requested a song through Amazon Music, if you ask Alexa to order paper towels or to play a song on Amazon Music, the record of that purchase or song play may inform relevant ads shown on Amazon or other sites where Amazon places ads. Customers can opt out of interest-based ads from Amazon at anytime on our website."</p>
<figcaption class="blockquote-footer"><cite title="Source Title">Amazon</cite>
</figcaption>
</figure>
We welcome critiques of our research methodology and our findings, but Amazon's statement does not directly address our findings. Specifically, we find that Echo devices running Alexa skills communicate with advertising services (Section 4.2). We find that Amazon infers users' advertising interests from their Echo <i>interactions</i> (Section 6). We find that Amazon's advertising partners sync (share) cookies with Amazon, and that Amazon's partner advertisers bid more than non-partner advertisers to place ads for Echo personas (users) that install and interact with Alexa skills (Section 5.5). We also find that Amazon's and Echo skills' operational practices are often not clearly disclosed in their privacy policies (Section 7).
<br><br>
We do not claim that Amazon directly shares voice input/transcripts with advertising networks.
<br><br>
Amazon's statement tells users that it serves ads based on Echo interactions and that users can opt out of interest-based ads. This confirms our conclusion that it indeed uses interests inferred from users' <i>interactions</i> with Echos for behavioral advertising. Amazon does not refute our claims that it also shares users' interests with its advertising partners.
</p>
</div>
</div> -->
<hr class="col-12 col-md-12 mb-12">
<div class="row g-12">
<div class="col-md-12">
<span class="mb-0 fs-4 d-flex align-items-center text-dark RT"> <a class="text-dark text-decoration-none fw-bold" href="#RT">Research team</a></span>
<br>
<div class="profiles my-5">
<div class="profile">
<img src="img/aniketh.jpg">
<p>
<a class="d-flex justify-content-center" href="https://www.anikethgirish.in">Aniketh Girish</a>
</p>
</div>
<div class="profile">
<img src="img/tianrui.jpg">
<p>
<a class="d-flex justify-content-center" href="https://hutr.info">Tianrui Hu</a>
</p>
</div>
<div class="profile">
<img src="img/vijay.jpeg">
<p>
<a class="d-flex justify-content-center" href="https://viz-prakash.github.io">Vijay Prakash</a>
</p>
</div>
<div class="profile">
<img src="img/daniel.png">
<p>
<a class="d-flex justify-content-center" href="https://www.danieldubois.org/">Daniel Dubois</a>
</p>
</div>
<div class="profile">
<img src="img/srdjan.jpg">
<p>
<a class="d-flex justify-content-center" href="https://software.imdea.org/es/people/srdjan.matic/">Srdjan Matic</a>
</p>
</div>
<div class="profile">
<img src="img/danny.jpg">
<p>
<a class="d-flex justify-content-center" href="https://mlab.engineering.nyu.edu">Danny Yuxing Huang</a>
</p>
</div>
<div class="profile">
<img src="img/serge.jpeg">
<p>
<a class="d-flex justify-content-center" href="https://www.icsi.berkeley.edu/icsi/people/egelman">Serge Egelman</a>
</p>
</div>
<div class="profile">
<img src="img/joel.jpg">
<p>
<a class="d-flex justify-content-center" href="http://pages.cpsc.ucalgary.ca/~joel.reardon/">Joel Reardon</a>
</p>
</div>
<div class="profile">
<img src="img/juan.jpg">
<p>
<a class="d-flex justify-content-center" href="https://cosec.inf.uc3m.es/people/juan-tapiador/">Juan Tapiador</a>
</p>
</div>
<div class="profile">
<img src="img/david.png">
<p>
<a class="d-flex justify-content-center" href="https://david.choffnes.com/">David Choffnes</a>
</p>
</div>
<div class="profile">
<img src="img/narseo.jpg">
<p>
<a class="d-flex justify-content-center" href="https://networks.imdea.org/team/imdea-networks-team/people/narseo-vallina-rodriguez/">Narseo Vallina-Rodriguez</a>
</p>
</div>
<br> <br>
<div id="logos" class="pt-5">
<a href="https://networks.imdea.org/">
<img src="logos/imdea-networks-white background.png" id="davis-logo">
</a>
<a href="https://www.uc3m.es/Home">
<img src="logos/uc3m.jpeg" id="davis-logo">
</a>
<a href="https://www.khoury.northeastern.edu">
<img src="logos/khoury-cs.png" id="davis-logo">
</a>
<a href="https://engineering.nyu.edu">
<img src="logos/nyu.png" id="irvine-logo">
</a>
<a href="https://www.berkeley.edu/">
<img src="logos/berkeley.png" id="davis-logo">
</a>
<a href="https://icsi.berkeley.edu/">
<img src="logos/icsi.jpeg" id="davis-logo">
</a>
<a href="https://software.imdea.org/">
<img src="logos/imdea-software.svg" id="davis-logo">
</a>
<a href="https://www.ucalgary.ca/">
<img src="logos/calgary.png" id="davis-logo">
</a>
<a href="https://properdata.eng.uci.edu">
<img src="logos/proper-data.png" id="ppd-logo">
</a>
<a href="https://appcensus.io/">
<img src="logos/appcensus.webp" id="appcensus-logo">
</a>
</div>
</div>
</div>
</main>
<footer class="pt-3 text-muted border-top">
The project is partially funded by the NSF ProperData award (SaTC-1955227, CNS-2219867), EU H2020 grant TRUST aWARE (101021377), Atracción de Talento grant (Ref. 2020-T2/TIC-20184), PRODIGY Project (TED2021-132464B-I00) funded by MCIN/AEI/10.13039/501100011033 and the European Union NextGenerationEU/PRTR, and the grant PID2022-142290OB-I00, funded by MCIN/AEI/10.13039/501100011033 and by the ESF+. </footer>
</div>
</body>
</html>