563.10.3 CAPTCHA

Presented by: Sari Louis
SPAM Group: Marc Gagnon, Sari Louis, Steve White

University of Illinois
Spring 2006

Agenda
• Definition
• Background
• Applications
• Types of CAPTCHAs
• Breaking CAPTCHAs
• Proposed Approach
• Conclusion

Definition
• CAPTCHA stands for Completely Automated
Public Turing test to tell Computers and Humans
Apart
• A.K.A. Reverse Turing Test, Human Interaction
Proof
• The challenge: develop a software program that
can create and grade challenges most humans
can pass but computers cannot

Background
• First used by AltaVista in 1997
– Reduced SPAM add-URL submissions by over 95%

• CMU/Yahoo!
– Automated the creation and grading of
challenges

• PARC
– Relies on document image degradation to
prevent successful OCR
– Conducted user-focused studies to assess
the effectiveness of CAPTCHAs

Background
• CAPTCHAs are based on open AI
problems
• Breaking CAPTCHAs helps advance AI by
solving these open problems
• Improving CAPTCHAs helps tell
computers and humans apart
• Win-win situation

Background - Papers
• Pessimal Print: A Reverse Turing Test
Allison L. Coates, Henry S. Baird, Richard J. Fateman

• Telling Humans and Computers Apart
Automatically
Luis von Ahn, Manuel Blum, and John Langford

• CAPTCHA: Using Hard AI Problems for
Security
Luis von Ahn, Manuel Blum, Nicholas J. Hopper, and John Langford

• Using Machine Learning to Break Visual
Human Interaction Proofs (HIPs)
Kumar Chellapilla, Patrice Y. Simard

Applications
• Free email services
• Online polls
• Dictionary attacks
• Newsgroups, blogs, etc.
• SPAM

Types of CAPTCHAs
• Text based
– Gimpy, ez-gimpy
– Gimpy-r, Google CAPTCHA
– Simard’s HIP (MSN)

• Graphic based
– Bongo
– Pix

• Audio based

Text Based CAPTCHAs
• Gimpy, ez-gimpy

– Pick a word or words from a small dictionary
– Distort them and add noise and background

• Gimpy-r, Google’s CAPTCHA
– Pick random letters
– Distort them, add noise and background

• Simard’s HIP
– Pick random letters and numbers
– Distort them and add arcs
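
As a rough illustration of the "pick random letters and numbers" step and of automatic grading, here is a minimal Python sketch. The names `make_challenge` and `grade`, the 36-character alphabet, and the default length of 6 are assumptions made for illustration, not details from any of the schemes above. Rendering the text as an image and distorting it (noise, background, arcs) is not shown.

```python
import secrets
import string

# Assumed character set: uppercase letters and digits (36 symbols).
ALPHABET = string.ascii_uppercase + string.digits


def make_challenge(length: int = 6) -> str:
    """Pick `length` characters at random from ALPHABET."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def grade(expected: str, answer: str) -> bool:
    """Accept the answer if it matches, ignoring case and outer whitespace."""
    return answer.strip().upper() == expected.upper()


challenge = make_challenge()
print(len(challenge))                      # 6
print(grade(challenge, challenge.lower())) # True
```

The `secrets` module is used instead of `random` so that the challenge text is not predictable from earlier challenges.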

Text Based CAPTCHAs

Graphic Based CAPTCHAs
• Bongo
– Display two series of blocks
– User must find the characteristic that sets the
two series apart
– User is asked to determine which series each
of four single blocks belongs to


Difference? thick vs. thin lines

Graphic Based CAPTCHAs
• PIX
– Create a large database of labeled images
– Pick a concrete object
– Pick four images of the object from the
images database
– Distort the images
– Ask the user to pick the object from a list of
words
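
The PIX steps above could be sketched roughly as follows. This is a simplified illustration, not the actual PIX implementation: `IMAGE_DB`, its file names, and `make_pix_challenge` are hypothetical, the word list is assumed to be every label in the database, and the image distortion step is omitted.

```python
import random

# Hypothetical labeled-image database: object name -> image identifiers.
IMAGE_DB = {
    "dog": ["dog_01.jpg", "dog_02.jpg", "dog_03.jpg", "dog_04.jpg", "dog_05.jpg"],
    "pool": ["pool_01.jpg", "pool_02.jpg", "pool_03.jpg", "pool_04.jpg"],
    "tree": ["tree_01.jpg", "tree_02.jpg", "tree_03.jpg", "tree_04.jpg"],
}


def make_pix_challenge(db, rng, images_per_challenge=4):
    """Pick an object, then pick distinct images of it from the database."""
    answer = rng.choice(sorted(db))
    images = rng.sample(db[answer], images_per_challenge)
    choices = sorted(db)  # the list of words shown to the user
    return answer, images, choices


answer, images, choices = make_pix_challenge(IMAGE_DB, random.Random())
print(len(images))        # 4
print(answer in choices)  # True
```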

Graphic Based CAPTCHAs

Dog

Pool

Audio Based CAPTCHAs

• Pick a word or a sequence of numbers at
random
• Render them into an audio clip using a
TTS software
• Distort the audio clip
• Ask the user to identify and type the word
or numbers

Breaking CAPTCHAs
• Most text based CAPTCHAs have been
broken by software
– OCR
– Segmentation

• Other CAPTCHAs were broken by
streaming the tests to unsuspecting users,
who solve them on the attacker's behalf.

Proposed Approach
• Very similar to PIX

• Pick a concrete object
• Get 6 images at random from
images.google.com that match the object
• Distort the images
• Build a list of 100 words: 90 from a full
dictionary, 10 from the objects dictionary
• Prompt the user to pick the object from the
list of words
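
One possible way to build the 100-word list is sketched below. The slide does not say how the correct answer gets into the list, so this sketch assumes it is one of the 10 words drawn from the objects dictionary. The function name `build_word_list` and its parameters are hypothetical, and the dictionaries in the usage example are synthetic placeholders.

```python
import random


def build_word_list(full_dictionary, object_dictionary, answer, rng,
                    n_full=90, n_objects=10):
    """Return n_full + n_objects distinct, shuffled words containing `answer`."""
    # Assumption: the answer counts as one of the object-dictionary words.
    other_objects = sorted(set(object_dictionary) - {answer})
    object_words = rng.sample(other_objects, n_objects - 1) + [answer]
    # Draw the filler words from the full dictionary, avoiding repeats.
    filler_pool = sorted(set(full_dictionary) - set(object_words))
    words = object_words + rng.sample(filler_pool, n_full)
    rng.shuffle(words)
    return words


# Synthetic placeholder dictionaries, for illustration only.
full = [f"word{i}" for i in range(500)]
objects = [f"object{i}" for i in range(20)]
words = build_word_list(full, objects, "object3", random.Random())
print(len(words))          # 100
print("object3" in words)  # True
```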

Proposed Approach - Technical
• Make an HTTP call to images.google.com
and search for the object
• Screen-scrape the first 2-3 result pages to
get the list of images
• Pick 6 images at random
• Randomly distort both the images and
their URLs before displaying them
• Expire the CAPTCHA in 30-45 seconds
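
The expiration step could look roughly like the sketch below. It assumes that "30-45 seconds" means each challenge gets a lifetime drawn at random from that range, which is only one reading of the slide. The `Challenge` class and the `issue` and `check` functions are hypothetical names. Fetching, scraping, and distorting the images is not shown.

```python
import random
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class Challenge:
    answer: str
    expires_at: float  # seconds on the monotonic clock


def issue(answer, rng, now=None, min_ttl=30.0, max_ttl=45.0):
    """Create a challenge that expires min_ttl to max_ttl seconds from now."""
    now = time.monotonic() if now is None else now
    return Challenge(answer, now + rng.uniform(min_ttl, max_ttl))


def check(challenge, response, now=None):
    """Reject expired challenges first, then compare the response."""
    now = time.monotonic() if now is None else now
    if now >= challenge.expires_at:
        return False
    return response.strip().lower() == challenge.answer.lower()


c = issue("dog", random.Random(), now=100.0)
print(check(c, "Dog", now=120.0))  # True: answered within the lifetime
print(check(c, "dog", now=146.0))  # False: past the 45 second upper bound
```

The `now` parameter exists only so the example can be run without waiting; a server would normally leave it unset and rely on `time.monotonic()`.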

Proposed Approach - Benefits

• The database already exists and is public
• The database is constantly being updated
and maintained
• Adding “concrete objects” to the dictionary
is virtually instantaneous
• Distortion prevents caching hacks
• Quick expiration limits streaming hacks

Proposed Approach - Drawbacks
• Not accessible to people with disabilities
(as is the case with most CAPTCHAs)
• Relies on Google’s infrastructure
• Unlike CAPTCHAs using random letters
and numbers, the number of challenge
words is limited
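
To give a sense of scale for the last drawback, the short sketch below compares the number of possible answers for a random-character CAPTCHA with the odds of a single blind guess against a 100-word list. The 36-character alphabet and the length of 6 are assumed values chosen for illustration. The 100-word list size comes from the proposed approach.

```python
from fractions import Fraction


def random_string_space(alphabet_size: int, length: int) -> int:
    """Number of distinct strings of `length` over `alphabet_size` symbols."""
    return alphabet_size ** length


def blind_guess_probability(num_choices: int) -> Fraction:
    """Chance that one uniformly random pick from the list is correct."""
    return Fraction(1, num_choices)


print(random_string_space(36, 6))    # 2176782336
print(blind_guess_probability(100))  # 1/100
```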
