Nasir Pasha & Matt Staub

The Consequences of Scraping Data From A Competitor [e221]

The guys discuss the lawsuit filed by PhantomAlert against Waze over allegations that Waze scraped PhantomAlert’s database.


NASIR: All right. Welcome to our podcast where we cover business in the news and add our legal twist. My name is Nasir Pasha.
MATT: And I’m Matt Staub.
NASIR: And here we are today on another episode. Today’s Monday – my favorite episode day, second to Wednesday.
MATT: In the top two.
NASIR: Yeah, top two of the week.
MATT: Well, that’s good. You’re the one that kind of discovered this.
NASIR: Yeah.
MATT: Were you familiar with it beforehand?
NASIR: Actually, it’s funny enough, how I found out about this, I happened to look at Google Maps and I was navigating somewhere and it said that there was a traffic incident reported by Waze and Waze is kind of like a navigating app but it’s really cool on road trips because what it’ll do is it’ll tell you if there is an accident in front of you or if there is a cop, a speed trap, and how it works is that you can actually report – like, if you see a police officer, you can say, “Okay, I just saw a police officer,” and hit a button and then it’s basically reporting it to the app and now everyone else sees it and so now there’s this kind of social aspect to reporting the traffic and different incidences or even attractions and so forth. And so, their data became so valuable because of the users that Google Maps actually acquired them for their data, of course, to integrate within Google Maps. Those of you who use Google Maps pretty regularly, you’ve already noticed in the past six months how much more information you have as far as traffic data. I remember in San Diego, it used to show those red, yellow, and green lines for traffic data only on highways because that’s the only way it had sensors. If you were in another city that wasn’t as advanced, you would have no traffic data whatsoever. But, now, you have traffic data on side streets and pretty much every street that has enough people based upon this kind of reporting data from Waze and other sources as well.
MATT: I think that’s pretty common. At least I’m one of the few people that uses – or at least I feel like I am – that have an iPhone and use the actual Maps app that comes on there. I mean, people complain about it all the time. I never have problems with it. It works just fine for me.
NASIR: I think it had problems in the beginning but that’s it, you know, because I think when Apple decided to, if you recall, I think at one point they said, “Okay, we’re going to not list Google Maps on the store at all,” and they had some backlash with that so I think it was more of a PR thing than anything else.
MATT: Yeah, and I guess Waze, if you’re stuck in a traffic jam, you let people know so, maybe down the road, when you’re trying to decide where to go, someone might do the same for you.
NASIR: Well, what’s neat about it is, if you have the program running – I think this is how it works – it’ll actually record how fast you’re going and things like that so it can actually record average traffic pace.
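The average-pace idea Nasir describes can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the segment names, the speed figures, and the `average_pace` helper are all hypothetical, and nothing here reflects how Waze actually computes traffic.

```python
from statistics import mean

def average_pace(speed_reports_mph):
    """Return the mean reported speed for one road segment, or None if there are no reports."""
    if not speed_reports_mph:
        return None
    return mean(speed_reports_mph)

# Hypothetical speed readings (mph) sent by phones running the app, grouped by road segment.
reports = {"main_st": [12, 15, 9], "oak_ave": [34, 31]}

# One average per segment; a map could colour each segment from this number.
paces = {segment: average_pace(speeds) for segment, speeds in reports.items()}
```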
MATT: Yeah, that’s what I was trying to figure out. Is that what all these little weird creatures are that look like Kirby?
NASIR: Yeah, Kirby, from Nintendo, I believe.
MATT: Is that what these things are?
NASIR: Uh, I guess. They’re basically little dialogue bubbles with smiley faces on it, if you can picture that, if you’ve never seen them before.
MATT: Yeah, but some of them are Kirbys and then there’s like a dog. Anyways…
NASIR: Yeah.
MATT: There’s Waze and there’s another site called Phantom Alert which looks to be a similar thing. One of the problems I have with them is it’s like, “Oh, DUI checkpoint.” It’s like, “Well, if someone should get a DUI, we don’t need to be telling people.”
NASIR: Yeah, speeding is one thing, but DUI traps are a different thing, that’s true.
MATT: You’re causing more harm than you are doing good by letting people avoid DUI checkpoints. Anyway, I’m on the Phantom Alert site as well, looks pretty similar, same sort of concept, a lot more icons but, yeah, we’re talking about a couple of similar ideas. Waze had approached Phantom Alert a few years back, in 2010, about some possible sharing databases because, like I said, it’s similar ideas. Phantom Alert said they weren’t interested basically because it looks like they gave the response of “we have a better database than you do so what are we getting out of this relationship?” I guess Waze allegedly didn’t take that too well and what they’re being accused of now is scraping the data from Phantom Alert’s site in order to basically beef up their database and their user base and that’s what eventually got them – assumed or at least in part – eventually got them bought out by Google. The interesting thing about this is Phantom Alert is saying, “You know how we know about this is we put all these fake points of interest on our database that weren’t real and – guess what – they got copied onto Waze’s database which is something that would have never happened if they were real but, you know, that’s the way we were able to figure this out.”
NASIR: Yeah, it looks like they got caught red-handed. I think, in computer terms, don’t they call it like a honey trap?
MATT: Fishing?
NASIR: No, we’re not fishing here. It’s some kind of honey trap I think is the term or what-have-you. They do this with spamming, too. They’ll set up fake email accounts that you register and I do this all the time, too. I put specific email addresses because I have a catch-all email address for every single registration. If I start getting spammed from some other third party, I’ll know where it actually came from and where my data was stolen. They do the same thing with this and it looks like they actually took this data and now they’re being sued which makes sense because, if you think about it, from reading the complaint, it seems like this happened a while ago but only now is this lawsuit being filed – of course, after they’ve been acquired by Google. Of course, if you look at the actual complaint which I’m looking at now, it includes both Google and Waze as defendants. Of course, if they were just suing Waze, it may not have been as lucrative of a lawsuit.
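The seeding trick described above, planting entries that do not exist and then looking for them in someone else's data, can be sketched roughly as follows. This is a hypothetical illustration, not the check Phantom Alert actually ran: the function name, the coordinates, and the matching tolerance are all assumptions.

```python
def find_copied_decoys(decoys, competitor_points, tolerance=1e-4):
    """Return the names of decoy entries whose coordinates also appear in competitor_points."""
    copied = []
    for name, (lat, lon) in decoys.items():
        for c_lat, c_lon in competitor_points:
            # A decoy does not exist in the real world, so a match suggests copying.
            if abs(lat - c_lat) <= tolerance and abs(lon - c_lon) <= tolerance:
                copied.append(name)
                break
    return copied

# Hypothetical decoys planted in our own database.
decoys = {
    "fake_speed_trap": (32.7157, -117.1611),
    "fake_school_zone": (32.8000, -117.2000),
}

# Hypothetical points observed in a competitor's database.
competitor_points = [(32.7157, -117.1611), (33.0000, -117.0000)]

copied = find_copied_decoys(decoys, competitor_points)
```

A match on a single decoy could be a coincidence; the more decoys that turn up, the harder that explanation becomes.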
MATT: Well, Waze, I believe, is free-to-use, right? So, I don’t know how much revenue they’re even producing. Obviously, Google is producing a significant amount of revenue.
NASIR: I think they’re making money now, Google. Are they cash flow positive yet? I know they were a start-up not too long ago.
MATT: They have enough money to create a new logo that people are going crazy about but I don’t really understand it. But, yeah, you’re right because this was done in 2012 – or at least what they’re alleging occurred back in 2012 – or at least when it started.
NASIR: They specifically refer to dates like in June 2013. They’ve been sitting on this and, let’s see, Google acquired them when? I believe it was this year. Here’s the thing about the data that was actually stolen. I have literally run into this with different kinds of clients and so forth because there’s this weird kind of concept that people still don’t get about when there’s data online that, because it can be accessed by the public, I can use it and copy it and use it for my own purposes however I want, and that’s just not true. It seems obvious from a legal perspective but I can see how people may be thinking… a good example of a common misconception, in the early internet, was copying images. And so, when people had different websites, they would just search images and then copy and paste it into their website and voila! They have their website done but, of course, that’s simple copyright infringement.
MATT: Yeah.
NASIR: How they probably did it – and I don’t know for sure – is probably using a method called “data scraping” where it’s basically similar to what Google does. You have certain scripts that go to a website and basically access the public data that is presented through a web page or through some back channels or what-have-you and copy that data and put it into their own system. Of course, if you don’t know which ones the fake points of interest are then you’re going to copy the fake ones in there. And so, this exact same kind of scenario happened with a site – actually, it was an attack on Craigslist – it was this other company, 3taps. I’m not sure about the website that they ran. Basically, they were copying the Craigslist listings. Actually, Craigslist ended up losing because of some kind of loophole. Basically, just like Waze and Phantom Alert, their data was data submitted by users like you and I. And so, the question of, “Okay, if we’re the ones submitting it, how does Craigslist or how does Waze or how does Phantom Alert own this content?” And so, Craigslist and these other companies too, they said that, “We have an exclusive license to this data,” – except, in this Craigslist case, they happened to forget to put the word “exclusive” in there. Then, when you say that, then it becomes a question of whether you own the right or not and so, but assuming that Phantom Alert had an exclusive license to this data and they aggregated all this data that is submitted from users like you and I, and they submit it and Waze takes it, that can be construed as maybe copyright infringement but there’s also aspects of conversion there. There’s different legal theories that can apply and that’s basically what Phantom Alert did. They sued for conversion and they sued for copyright infringement which were the two legal theories that probably fit best for this.
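For readers unfamiliar with the mechanics, here is a minimal sketch of what a scraping script does, using only Python's standard library. It parses a made-up page held in a string rather than fetching a real site; the markup, the `poi` class name, and the `ListingParser` class are all assumptions for illustration, and the complaint does not say how the data was actually obtained.

```python
from html.parser import HTMLParser

class ListingParser(HTMLParser):
    """Collect the text of every <li class="poi"> element on a page."""

    def __init__(self):
        super().__init__()
        self.in_poi = False
        self.points = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag.
        if tag == "li" and ("class", "poi") in attrs:
            self.in_poi = True

    def handle_endtag(self, tag):
        if tag == "li":
            self.in_poi = False

    def handle_data(self, data):
        if self.in_poi and data.strip():
            self.points.append(data.strip())

# Hypothetical page content; a real scraper would download this over HTTP.
PAGE = """
<ul>
  <li class="poi">Speed camera, 5th Ave</li>
  <li class="poi">School zone, Elm St</li>
  <li class="ad">Sponsored link</li>
</ul>
"""

parser = ListingParser()
parser.feed(PAGE)
scraped = parser.points
```

The script copies whatever the page presents. It has no way of telling a genuine entry from a planted one, which is why seeded fakes end up in the copy.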
MATT: I mean, there’s ways to go about it where it’s possible they can get away with it – now, I’m not saying in this specific instance but in general – with the way data scraping works. But, in this instance, it looks like they just pretty much straight ripped off… I mean, especially because they’re doing the same thing so they just ripped off their site and used that information to essentially get bought out by Google or acquired by Google. I think, like a lot of the ones we discuss, I think Google will end up paying Phantom Alert some amount of money and do whatever because we don’t even know for sure how much of the data they took from Phantom Alert, correct?
NASIR: Yeah, and it could be very minuscule data. Or who knows? Maybe Waze has a defense? I was thinking maybe because both are user-submitted content, somehow there were other users that were taking one data from one set to another but they have all these different algorithms, too. I was reading from both Waze and Phantom Alert, they have algorithms to determine fake places of interest and real ones based upon how many people are submitting to the site, how many people confirm it, whether it’s accurate or not, and things like that and the timing and who’s doing it and all that. And so, assuming the allegations are true, it would seem strange and who knows exactly what happened but these are pretty specific allegations, and the fact that these fake points of interest are on the Waze database is pretty damning in itself. I’m not sure how Waze or Google for that matter is going to get out of that and I would suspect that, for any of these instances that happened before Waze was acquired, most likely, Waze would have to indemnify Google for that. And so, I suspect that, since Waze was just acquired, that if there was any kind of liability, it’s going to come back to Waze and it’s going to come out of their acquisition money. Where else would it come from, you know?
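The kind of filtering Nasir mentions, trusting a report only when enough people confirm it and it is recent enough, might look something like the sketch below. Neither company's actual algorithm is public in this discussion, so the field names and both thresholds are purely assumed.

```python
def is_trusted(report, min_confirmations=3, max_age_minutes=60):
    """Treat a report as real only if enough users confirmed it and it is still fresh."""
    return (
        report["confirmations"] >= min_confirmations
        and report["age_minutes"] <= max_age_minutes
    )

# Hypothetical user-submitted reports.
reports = [
    {"name": "speed trap", "confirmations": 5, "age_minutes": 10},
    {"name": "lone report", "confirmations": 1, "age_minutes": 5},
    {"name": "stale accident", "confirmations": 8, "age_minutes": 240},
]

# Keep only the reports that pass both checks.
trusted = [r["name"] for r in reports if is_trusted(r)]
```

A filter like this only works on reports that arrive through the app. Entries copied in bulk from another database would skip it, which fits with how an unconfirmed fake point could survive.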
MATT: Yeah, and if it’s all user-submitted data then what does it really matter? But, supposedly, Phantom Alert does have this systematic process of identifying points of interest for users, et cetera. So, it’s more than just – I don’t know what it is – but it’s more than just users seeing something, putting it on this grid. That’s what they’re really claiming is being compromised.
NASIR: If you think about it, both their businesses – both Waze and Phantom Alert – are acquisition plays, right? I mean, there’s only so much they can do with that data because, if you have to log into their software to use their app, it may not have all the bells and whistles that Google Maps has and vice versa. And so, Google Maps or Apple Maps is a great acquisition target for them. And so, when their competitor, Waze, goes and gets acquired by Google on the so-called backs of Phantom Alert’s data, I can understand why they may be upset. I mean, I don’t know if it’s from the complaint or the press release, they describe how Mr. Scott who is apparently the founder, he said, “I started Phantom Alert seven years ago as an entrepreneur with a dream and now that dream has been crushed by companies that are profiting from years of blood, sweat, and tears our team put into our product.” A little dramatic, but I can understand that kind of position.
MATT: Yeah, it’s frustrating but, to be honest, the Waze, well, I think they’re both kind of stupid but I think Waze’s looks better than Phantom Alert’s. Phantom Alert’s is just all school zones.
NASIR: Too busy, right?
MATT: Well, the thing I don’t understand – because I’ve never used this – how do you know about these? Does it overlay on whatever maps app you’re using?
NASIR: No, it has its own map – well, at least Waze does. It had its own map software. I’m sure the app is still running and I think it actually uses Google Maps to navigate but it’s still under its own software.
MATT: It does now, I’m assuming at least.
NASIR: Yeah, it does now, I’m sure. Phantom Alert tends to focus on speed traps, speed cameras, like you said, DUI checkpoints and red-light cameras and so forth. Waze does that too but it doesn’t focus on that. It has more of a focus on traffic and things like that. I mean, they’re still competitors but, just by looking at the two apps, Waze does seem to be a little bit better and I heard of Waze years before. I’ve never heard of Phantom Alert. If that’s anecdotal to how popular Phantom Alert was, I don’t know.
MATT: Yeah, Phantom Alert is more for police enforcement which, if the police were smart about it, they would just put on fake data points themselves and get people to drive through the areas where they’re really at.
NASIR: But that’s where the algorithm comes in.
MATT: I’m not sold on it.
NASIR: Yeah, but what about the fact that they may have taken the data? Even if it’s not that great of a company or that great of a software or what-have-you, if they took the data and that was part of the reason why Google, I mean, that’s the reason why Google bought them – their data. I mean, not their software because they integrate that data into the Google Maps as a perfect fit.
MATT: Well, I mean, there’s only so many ways you can figure out something on a map.
NASIR: What do you mean?
MATT: You didn’t pick up on that one, I guess.
NASIR: No, I don’t get it.
MATT: There’s only so many ways.
NASIR: Oh, so many ways. That was a great, great joke. We should write that down. Use that again.
All right. Well, I think that’s our episode on copyright infringement and scraping data. Pretty common in business now.
MATT: Yeah, I mean, although we only talked about these map apps or these map functions, it actually applies to a lot of businesses, I could say in terms of the data scraping side of it – or it could.
NASIR: Oh, yeah.
MATT: It’s more than just these maps.
NASIR: There are businesses, even amongst our clients, if you think about it, that are based upon gathering data. They have to be very careful on how they do that because, even accessing a site multiple times for the purposes of scraping can have its liability issues, let alone doing it and then copying it and then displaying it again on your site which is a whole different issue. One thing I noticed is that, in the complaint, it doesn’t describe how Waze actually got the data and I think that may come out later in discovery and may change their causes of action.
MATT: Yeah, definitely.
NASIR: Okay. Well, thanks for joining us, everyone.
MATT: Keep it sound and keep it smart.


