New Scientist has an article about a software system called ChatNannies that purports to engage pedophiles in chat room conversations in an effort to catch them "grooming" children for real-life meetings. There's a transcript of one such conversation, and the creator claims the program is so effective that no one has caught on yet.
As an expert in artificial intelligence, I'm extremely skeptical, and I'll explain why.
First, there are details in the article that just don't make sense.
The nanniebots do such a good job of passing themselves off as young people that they have proved indistinguishable from them. In conversations with 2000 chatroom users no one has rumbled the bots, [Jim] Wightman [the author] says. ...
Wightman currently has 100,000 bots chatting away undetected in chatrooms - the most he can generate on the four internet servers at his IT practice. He would like to build more but funding is the sticking point, as he does not want anyone to profit financially from his technology.
He's got 100,000 bots running, but only 2000 conversations in which a bot has gone undetected. That's a success rate of at most 2 percent, miserably low but actually quite believable. I suspect these numbers were intended to mean something else, but what?
Then there's Wightman's reluctance to reveal details of the system to anyone.
One of its tricks is to use the internet itself as a resource for its information on pop culture. Wightman will not reveal how it judges what is reliable information and what not. He does say, however, that each bot has dozens of parameters that are assigned at random, to give each one a different "personality". ...
"Some companies have offered fantastic sums of money, but all want technology ownership. And that's something that isn't going to happen," he says. Instead, he hopes eventually to get financial support from government-run organisations that focus on child protection.
If this is a fraud, it would be a lot easier and safer to profit from government handouts than to actually risk revealing the "system" to technically savvy investors. This is why reputable scientists publish the details of their research.
Interestingly, the transcript does include a few hints that it's likely machine-generated. In this script, "B" is the purported machine and "A" is the human.
B - pancake day! i love pancakes...mmmm so tasty
A - yeah me too, but i forget every damn year
B - did you forget this year?
The response by B is very script-ish. Notice also that B's responses are longer on average than A's. That's a sign of a poor (i.e., standard) conversation routine. It's very hard to generate complex sentence structures that sound natural.
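The length imbalance is easy to measure. As a toy illustration (my own heuristic, not anything from Wightman's system), here's how you might compare average message length per speaker across a chat log:

```python
# Toy heuristic: average message length per speaker in a transcript.
# Template-driven bots often produce longer, more uniform messages
# than real chatroom users, so a large gap is a weak bot signal.

transcript = [
    ("B", "pancake day! i love pancakes...mmmm so tasty"),
    ("A", "yeah me too, but i forget every damn year"),
    ("B", "did you forget this year?"),
]

def average_lengths(turns):
    """Map each speaker to their mean message length in characters."""
    totals, counts = {}, {}
    for speaker, text in turns:
        totals[speaker] = totals.get(speaker, 0) + len(text)
        counts[speaker] = counts.get(speaker, 0) + 1
    return {s: totals[s] / counts[s] for s in totals}

lengths = average_lengths(transcript)
# On a longer log, compare lengths["A"] against lengths["B"].
```

A three-line snippet proves nothing, of course; the signal only means something over a long conversation.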
Here's another interaction, with my comments.
B - oh cool. did you watch robocop 2 last night?
A - what side was it on?
B - sky one
A - we haven't got sky
A - but i've seen it before
A - it wasn't as good as robocop
B - i agree, though it was cool in places.
Canned response, ok.
A - did you watch robocop last night
B - yes, i just said i did!
A - no you said you watched robocop 2 not robocop - so which one was it?
B - robocop 2 - pedant!
Interesting confusion of tokens. Either the system treats "robocop 2" as a token distinct from "robocop", or it assumes A is using shorthand for the same token. This leads to confusion for the robot, which is fine, but there's no way it could be smart enough to untangle the subsequent miscomprehension. The use of "pedant" after a dash as an exclamation feels made up. It's not a very natural chat construction, particularly for a child, and I can't imagine a robot could so easily identify the source of confusion and label it so appropriately.
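The mix-up is consistent with a matcher that looks up known titles as atomic tokens. Here's a speculative sketch of how such confusion could arise — this is my own guess at the mechanism, not Wightman's code:

```python
# Speculative sketch: a greedy substring matcher over known titles.
# "robocop" is a prefix of "robocop 2", so once the "2" is dropped
# from the user's follow-up, the matcher silently falls back to the
# bare title and the bot loses track of which film was meant.

KNOWN_TITLES = ["robocop 2", "robocop 3", "robocop"]  # longest first

def match_title(utterance):
    """Return the first known title found in the utterance, or None."""
    for title in KNOWN_TITLES:
        if title in utterance:
            return title
    return None

# The earlier context established "robocop 2", but the follow-up
# "did you watch robocop last night" matches only the bare token.
assert match_title("did you watch robocop 2 last night?") == "robocop 2"
assert match_title("did you watch robocop last night") == "robocop"
```

A matcher this naive would get confused exactly where the transcript's bot did; what it would not do is then diagnose the confusion and call its interlocutor a pedant.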
A - not robocop or robocop 3 or robocop the series B - it was definitely robocop 2, the one with kain the second robocop in it. i haven't seen robocop 3 or the series.
If the system is genuine, this is a remarkable feat of comprehension. Most humans would be confused by this point.
Anyway, this conversation could be machine-generated, but I suspect it's not representative of how any real system interacts with humans on a consistent basis.
Beyond all this, the creator claims the software can reliably detect pedophiles based on non-sexual conversations? No way. Human children and parents can't even do that face-to-face, and we're finely tuned to pick up on vocal, physical, and conversational cues that aren't present in text chats.
The promotional copy is even more breathless:

The most technologically advanced AI construct ever conceived and built. The NannieBot spawns and controls a large number of virtual internet users, whose behaviour is indistinguishable from humans interacting on the internet. The first AI construct to effortlessly pass the 'Turing Test', after more than 13 hours of conversation the AI was still undiscovered!
The only thing is, I don't see the catch. He asks for sponsors and donations, but he doesn't directly charge money for the software. Of course, the software isn't released yet, and they're auctioning off the first public chat with their robot on eBay. Maybe he really is hoping some government will fund his project?
Any system can occasionally hit a home run, but the claims in this article are not credible, in my opinion. Go here to chat with some of the best existing real chatbots; none of them are anywhere near the capabilities claimed by Wightman.
Via Apothecary's Drawer and Waxy.org I see that a grad student at MIT named Cameron Marlow managed to secure an exclusive interview with one of the NannieBots. From the transcript he's posted it's virtually certain he was talking to a human posing as a robot.
The secret to a good scam is knowing how far you can go before you cross the line into absurdity. Jim Wightman doesn't have a clue.