CAPTCHA Concepts

Stryder · Nov 7, 2007

I've just started a little project to create a decent CAPTCHA. Admittedly I'm overcomplicating the plan a little, mainly to try and stay one step ahead of the people responsible for the Weak AI bots.

Currently I'm looking at using images and the GD library to hold all the data a human requires to gain access where a bot cannot. However, I've come across a problem.

The problem is that the images I use, although the same in pixel dimensions, are not the same size in bytes. Even if the same source file is used to build the image, the byte size can be slightly different. So here's a question for the image manipulators out there:

How do I go about 'packing' the images so they are all the same filesize? (I'm not worried about byte-by-byte comparisons, just overall filesize.) I ask this (while also researching it) because the filesize gives away a value that could otherwise undermine any further work done towards a CAPTCHA system.

oozish · Nov 7, 2007

use batch image converter

http://www.imageresizer.com/

Stryder · Nov 7, 2007

I don't think a batch image converter would make the filesizes the same. Resizing the dimensions of the image isn't the problem; it's the byte size that is.

For instance, one image might be 73.45 KB whereas another will be 85.24 KB. A bot could tell two such images apart easily by their overall filesize, without going to town trying to work out what the image actually is.
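To make the worry concrete, here is a minimal sketch of how little a bot would need. The table of sizes and labels is entirely hypothetical; it only assumes the bot has seen each image once before and recorded its byte count.

```python
from typing import Optional

# Hypothetical data: byte counts a bot has recorded for images it has already seen.
KNOWN_SIZES = {
    75213: "cat",  # roughly 73.45 KB
    87286: "dog",  # roughly 85.24 KB
}


def guess_from_size(image_bytes: bytes) -> Optional[str]:
    """Return the label recorded for this exact file size, or None if unseen."""
    return KNOWN_SIZES.get(len(image_bytes))
```

No image processing happens at all here, which is why equal filesizes matter before any visual trickery does.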

So I've got to find a way to add 'packing' to the image, to bolster it by an amount so that all images come out at the same KB size. I'm guessing using alpha channels is probably a must, but obviously this isn't something with many online tutorials on how to do it.

Blindman · Nov 8, 2007

You are not going to be able to get the same byte size for compressed images, because the filesize depends more on the image content than on the pixel dimensions. Your best bet is to use an uncompressed image format. Another thing you could do is pad a compressed image file with random data.
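As a rough illustration of why an uncompressed format sidesteps the problem, the sketch below computes the size of a BMP from its dimensions alone. It assumes the common layout of a 14-byte file header plus a 40-byte BITMAPINFOHEADER, no palette and no compression; the function name is made up for this example.

```python
def bmp_file_size(width: int, height: int, bits_per_pixel: int = 24) -> int:
    """Byte size of an uncompressed BMP (assumed: 14 + 40 byte headers, no palette)."""
    # Each pixel row is padded up to a multiple of 4 bytes.
    row_bytes = ((width * bits_per_pixel + 31) // 32) * 4
    return 14 + 40 + row_bytes * height
```

Under those assumptions, two 200x100 images always come out at 60054 bytes each, whatever they depict.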

Stryder · Nov 8, 2007

That's what I was thinking. I was thinking of working out from all the images the peak value of the bytesize and then trying to pack the images to that size or slightly over.
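A minimal sketch of that idea: find the peak size across a set of already-encoded images, then append random bytes to each one until it matches. It assumes the images are held in memory as byte strings and that the viewer ignores data after the end of the image; `pad_to_peak` is a name invented for this example.

```python
import os
from typing import List


def pad_to_peak(images: List[bytes]) -> List[bytes]:
    """Append random bytes to each image so all match the largest one in size."""
    peak = max((len(data) for data in images), default=0)
    # os.urandom(0) returns b"", so the largest image is left unchanged.
    return [data + os.urandom(peak - len(data)) for data in images]
```

Padding to slightly over the peak would only need a fixed margin added to `peak`.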

The images I intend to create will contain an image as the background, another image randomly positioned in a corner and a 4-character string located in another corner. Those will all obviously alter the bytesize too. Creating a border around the outside of the image might be the best bet, and then generating a random fill for byte packing. (Of course it's going to be complex coding, but it's not going to be easy for a bot writer to get past.)

Gustav · Nov 11, 2007

we shall get to the bottom of this
this i vow! a challenge!

off the top of my head..... vectors can resize without degradation
ahhhh, a link informative but not quite there

Gustav · Nov 11, 2007

/curses

its online

leopold · Nov 15, 2007

Stryder said:
I've just started a little project to create a decent CAPTCHA. Admittedly I'm overcomplicating the plan a little, mainly to try and stay one step ahead of the people responsible for the Weak AI bots.

although this doesn't answer your question it might give you something to chew on:
http://www.archive.org/details/Christopher_Abad_Advancements_in_Anonymous_eAnnoyance

Gustav · Nov 18, 2007

(4) In Photoshop, adjust the Image >> Size. When you select Image >> Size from the menu, a dialogue box will appear. Make sure that “Constrain Proportions” and/or “Resample Image” are checked. These options will create icons of chain links on the dialogue box indicating that some characteristics are linked together. For example, if Width and Height are linked and you lower the value in one, the value in the other will automatically lower by the amount necessary to preserve your image's aspect ratio. Near the top of the Image Size dialogue box is an estimate of the file size. When you make a change, and before you click OK, the estimate will change. Play around with the check boxes and the values to get the best image with the lowest file size. As long as you don't File >> Save, your original scan won't change. If you want to preserve your resized image but not change the original scan, save the resized image with a modified file name.

Choose the Image Size command.
Check Resample.
Type 8 in the Width field for print size and observe the following: Resolution remains at 220, but the width and height double to 1760 by 2200. Also notice the file size readout. It should say 11.1M (was 2.77M). The file size quadruples as we double the pixel dimensions!

http://www-personal.umich.edu/~esrabkin/LowerImageSize.htm
http://graphicssoft.about.com/cs/resolution/a/increasingres.htm
http://www.peachpit.com/articles/article.aspx?p=433340&seqNum=5&rl=1

crude but workable. arrow key should scroll values
nn is allegedly one of the interpolations offered
what the fuck is it.? i do not have photoshop
info wanted
thanks

google query=Nearest Neighbor "file size" change

ps: the offered algorithm might be a starting point

Gustav · Nov 22, 2007

well, stryder?

leopold · Nov 22, 2007

why not have something like "5+4=" ?
that would be simpler and less costly bytewise than images.
you could even go a step further and do "(3X6)+2=".

Gustav · Nov 22, 2007

blocks change file size

Stryder · Nov 22, 2007

leopold99 said:
why not have something like "5+4=" ?
that would be simpler and less costly bytewise than images.
you could even go a step further and do "(3X6)+2=".

Actually the image system is a sophisticated CAPTCHA, not the standard type. It integrates multiple image layers that are randomly generated from prebuilt atoms. The problem was/is that if the built image weighs a set size for a certain combination, the very effort put into making something visually difficult for a bot to crack can be undone by a mere byte count.

The problem with mathematical inputs is that all programming languages have mathematical functions. A bot just needs to read the input to be able to solve the equation.
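To show how little effort that takes, here is a small sketch that solves challenges in the style suggested above. It assumes the challenge arrives as plain text, uses only `+`, `-` and multiplication written as `X`, and ends in `=`; the `solve` function is purely illustrative.

```python
import ast
import operator

# Only these operators are accepted; anything else is rejected.
_OPS = {ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul}


def solve(challenge: str) -> int:
    """Evaluate a text challenge such as '(3X6)+2=' without using eval()."""
    expr = challenge.rstrip("= ").replace("X", "*").replace("x", "*")

    def ev(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, int):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](ev(node.left), ev(node.right))
        raise ValueError("unsupported expression")

    return ev(ast.parse(expr, mode="eval").body)
```

With this, `solve("5+4=")` gives 9 and `solve("(3X6)+2=")` gives 20.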

Blindman · Nov 23, 2007

What you need to do is guarantee that every image is unique. You could use a camera and a lava lamp: just capture a shot and you will have a unique picture. You no longer have to worry about a bot just scanning the byte size to match the image to the code. Though this may sound somewhat crude, there are many other sources of random data available.

Stryder · Nov 29, 2007

That reminds me a bit of the film 'Johnny Mnemonic' (not the short story), where Johnny uses realtime captures of random images from the television to act as the key to the encryption for the information stored in his head.

I guess what you mean is you could have a camera view a certain area of a tropical fish tank and then ask 'What fish do you see?' Of course in this instance a program would need to be written to ID the fish available, but the camera would show an image captured that second, as opposed to a cached one.

Interesting Idea Blindman

RubiksMaster · Dec 1, 2007

Just use a hex editor and pad the end with the right amount of 0xFF. You could probably write a little program with very little effort that would do this automatically. And obviously this only works to expand the filesize, so everything has to be the size of your biggest file.

I tested it with bitmaps and jpegs, and it works. It doesn't seem to alter the image at all.
