Archive for the ‘development’ Category
Wednesday, April 7th, 2010
Automatic tracing of uploaded photos is an important feature coming to clker.com. I will briefly describe how far we have come and what I hope to accomplish in the next few days before it is released.
First off, let me outline the benefits and drawbacks of auto tracing. The biggest and most obvious benefit is that no human work goes into the segmentation and vectorization process. However, it has been shown in lots of domains that when two humans segment the same picture, their results share only about 80-85% of their area. This means that humans disagree about the segmentation of roughly 15% of the image. That’s because the way our brain interprets an image depends on each person’s own understanding and knowledge of the image’s contents, and that can vary a lot. Even in a very specific context like brain segmentation done by radiologists, the agreement between their results was between 83-87%, using a similarity measure similar to the Jaccard index. So the disadvantage of automatic segmentation is that even if we found the perfect algorithm to segment an image, the user would still disagree with around 15-20% of its result.
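To make the agreement figures above concrete, here is a minimal sketch of the plain Jaccard index between two segmentations. It assumes each segmentation is given as a flat list of 0/1 pixel labels for a single region; the function name and the two sample masks are made up for illustration, and the studies mentioned above may have used a different variant of the measure.

```python
def jaccard_index(mask_a, mask_b):
    """Jaccard index |A and B| / |A or B| of two same-sized binary masks."""
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must have the same number of pixels")
    # Pixels that both segmentations assign to the region.
    intersection = sum(1 for a, b in zip(mask_a, mask_b) if a and b)
    # Pixels that at least one segmentation assigns to the region.
    union = sum(1 for a, b in zip(mask_a, mask_b) if a or b)
    # Two empty masks are treated as agreeing completely.
    return 1.0 if union == 0 else intersection / union


# Two hypothetical 10-pixel segmentations of the same object.
human_1 = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]
human_2 = [0, 1, 1, 1, 1, 1, 1, 0, 0, 0]
agreement = jaccard_index(human_1, human_2)  # 5 shared pixels out of 7
```

A value of 1.0 means the two segmentations are identical, and 0.0 means they share no pixels at all.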
The results that I will show here are based on this helicopter image (also shown at the top). There are several parameters used as inputs to the segmentation program. Most of those will either be selected by us to provide good results on average, or passed through clker’s interface in a more human-friendly fashion. The image was reduced in size to approximately 1 megapixel before processing.
These two images are the direct output of the vectorizing program, without any modifications. The difference between them is the number of regions in each. In the image on the right, we attempted to reduce the number of resulting segments, which can help greatly, especially if the user plans on editing the image afterwards.
To see how hard it is to produce clip art out of this segmentation, I tried to edit the image on the left (the one with the larger number of segments) and delete all the sky regions. It took me around a minute to do so, just selecting and deleting, and the result is below:
However, some people will not perceive this image as clip art. One reason is the lack of hard outlines around the object when compared to an image like the deck officer here. The other reason is that such clip art has a much smaller number of regions. I will be spending the next couple of days trying to get some borders up, and hopefully we’ll get some good results.
I currently imagine two ways to offer this feature. In the first, once you upload a raster image, clker will work on vectorizing it and then you will get an email with a link to the vectorized version. The other way (which will come later) will be to request a custom segmentation with the ability to adjust some of the parameters, for example requesting a smaller number of regions or different types of color enhancement, including equalization, white balance correction, etc.
Hopefully this will be running on clker by Monday next week.
Tuesday, June 16th, 2009
For the past few months, I’ve been putting my spare time into a new feature that I believe will greatly help in adding free vector images. The feature is an online tool that helps trace raster images. You upload your raster image, then trace it online to produce the vector in a semi-automated way.
There are many useful tools available now; however, none of them is able to produce what I perceive as clip art. The tools that I regularly use are potrace (from inside inkscape) and sometimes vectormagic. The problem with both is that if you try vectorizing a colored image they will produce a fuzzy, blurry svg, which does not look anywhere close to clip art.
I had excellent results with potrace on lineart images, but no luck with colored ones.
This online tool solves a big part of the colored images problem. Some of the test results are here (showing both the raster and the vectorized result):
Man wearing topper:
Lady answering phone:
Unsuccessful / Bad results:
1967 hair style:
It doesn’t work well with lineart either; potrace is a better candidate if you have lineart images.
All the raster images are in the public domain. The lady answering the phone was obtained from the NCI (National Cancer Institute) website, the rest from commons.wikimedia.org.
Those images resemble what I call clip art more than the output of vector magic or adobe illustrator does. The tool produces an outline, which I colored using inkscape, so don’t expect coloring from the tool for now.
Currently, I’m adding the required services on the website to communicate with the component and as soon as they’re done I’ll have it up for everyone to play with.
Sunday, April 19th, 2009
My server has been suffering from unexplained apache hangs. Once in a while, apache would stop accepting new connections. It would keep running in memory, but all incoming connections would time out. Until now, my only solution was a bash script that runs every minute and tries to read a text file from the website. If that fails, it restarts apache.
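The watchdog idea above can be sketched in a few lines. This is not the original bash script, only an illustration in Python under some assumptions: the probe URL and the restart command shown in the usage comment are placeholders, and the function names are made up.

```python
import http.client
import subprocess
import urllib.request


def site_is_up(url, timeout=10, opener=urllib.request.urlopen):
    """Return True if the probe file can be fetched within the timeout."""
    try:
        with opener(url, timeout=timeout) as response:
            return response.status == 200
    except (OSError, http.client.HTTPException):
        # Covers refused connections, timeouts and HTTP error responses.
        return False


def check_and_restart(url, restart_command, is_up=site_is_up, run=subprocess.run):
    """Restart the web server if the probe fails; return True if restarted."""
    if is_up(url):
        return False
    run(restart_command, check=False)
    return True


# Intended to be run once a minute, e.g. from cron (placeholder values):
# check_and_restart("http://localhost/probe.txt", ["apachectl", "restart"])
```

Passing the probe and the command runner in as arguments keeps the logic testable without touching a real server.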
On Friday evening, I replaced APC with Memcache. I know that Memcache is a little slower than APC for lots of reasons, including network overhead, but APC was next in the sequence of tests I’ve been running. It seems that APC was the reason apache was hanging. I know that some big websites like facebook use APC, but maybe they are using a different version. I also know that youtube is using memcache, so APC was one of my least likely suspects. The server has been running for the past 48 hours without a crash or a hang, which is longer than its average. It used to hang once every 20 hours or so.
Since I had my own caching functions, which in the end called APC, the amount of code that had to change was very small. Hopefully it will continue running without problems.
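The reason such a switch stays small can be illustrated with a wrapper that hides the cache backend. The sketch below is in Python rather than the PHP used on the site, and the class and method names are invented for illustration; an in-memory dict stands in for APC or Memcache.

```python
class DictBackend:
    """Stand-in backend; a real one would talk to APC or Memcache."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value):
        self._data[key] = value


class Cache:
    """The only caching interface the rest of the site calls."""

    def __init__(self, backend):
        self._backend = backend

    def fetch(self, key, compute):
        """Return the cached value, computing and storing it on a miss."""
        value = self._backend.get(key)
        if value is None:
            value = compute()
            self._backend.set(key, value)
        return value


cache = Cache(DictBackend())
computed = []
first = cache.fetch("front_page", lambda: computed.append(1) or "rendered page")
second = cache.fetch("front_page", lambda: computed.append(1) or "rendered page")
```

Swapping the backend means changing only the object passed to `Cache`, while every caller keeps using `fetch`.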
Sunday, March 15th, 2009
Lately, I’ve been having a lot of unexplained trouble with apache. After two or three days of work it suddenly stops accepting connections. A test shows that it is still running, with lots of forked children.
After lots of research on the internet I found more than a dozen possibilities. One case was the apache 2.2.9 and php 5.2 combo, with php running as a module. In that case, when Apache reaches MaxClients, it stops killing children as they reach MaxRequestsPerChild. At that point, apache just sits and rejects connections. Issuing a reload did not seem to solve the problem, and the only way to get it responding again was to restart the server.
The server had not been rebooted for approximately 270 days, which I don’t think should pose any problem. The only problem that I know of is holes in memory pages due to allocating and freeing memory, which may result in kswapd running more; however, that was not the case, as I didn’t see kswapd jumping up when I ran top.
Rebooting the server seemed to at least partially resolve this problem. Now when I lower MaxClients and MaxRequestsPerChild to force apache into this bug, it seems that reloading works. The ultimate test is to leave it running and see what happens in a week.
That made me go back to my own webserver project. It actually turned out to be much simpler than I anticipated, and I think that I will split my spare time, part to add some new features in php and the rest to finish this C web server.
Friday, February 13th, 2009
I’ve been seriously considering writing my own webserver instead of running apache. Although apache is very good, I sometimes feel things would be much more fun if I wrote this website in C. My reasoning is that although PHP is fast and APC provides a further boost, at some point I will have to cluster a set of machines and proxy the requests because of the various tweaks that need to be made in every piece of software.
So I started seriously considering how to write a webserver that is optimized for high request rates and makes better use of memory, and how to resolve the biggest problem in C, which is crashes due to memory allocation and writing outside arrays, without slowing down C even a bit.
I ended up writing a small library that does the memory handling, and a small webserver. I’ve been playing with them for a while and, not surprisingly, found that I can very easily serve more than 1000 pages a second without attempting to optimize string handling, using just the basic STL string, which in my opinion is very slow.
What I still need to do is support CGI, so I can execute the current PHP code and run it beside the C code. That will give me more time to port the website pages in phases without stopping the website or delaying the launch of this server.
Friday, February 6th, 2009
If you didn’t know, Flex and ActionScript (Flash) impose a limit on the bitmap size. For some reason, the max width or height cannot exceed 2048 pixels.
When applying a transformation matrix to a bitmap, it seems that Flex generates a temporary bitmap in memory, even for the areas that are not going to be displayed. Since there is a limit on the bitmap size, it usually won’t resize or zoom beyond a certain amount.
This problem will show up if you have a graphical application that views a bitmap and has a zoom-in feature. The solution that I found to this problem is: after calculating the transformation matrix (scaling from the zoom, translation from the scroll bars), calculate the inverse of the matrix. Using your viewport co-ordinates (usually 0,0 and width,height), determine the area of the bitmap that would appear if the bitmap were zoomed. Copy that area to a temporary bitmap, calculate a new transformation matrix composed only of the zoom, and apply it to the temporary image. That’s it!
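The inverse-mapping step can be sketched with plain arithmetic. The snippet below is Python rather than ActionScript and assumes a transform made only of a uniform zoom plus a translation, so that screen = zoom * bitmap + (tx, ty); the function name and the sample numbers are made up for illustration.

```python
import math


def visible_source_rect(zoom, tx, ty, view_w, view_h, bmp_w, bmp_h):
    """Return (x, y, w, h) of the bitmap area that lands in the viewport."""
    if zoom <= 0:
        raise ValueError("zoom must be positive")
    # Apply the inverse transform to the viewport corners (0,0) and (w,h).
    left = (0 - tx) / zoom
    top = (0 - ty) / zoom
    right = (view_w - tx) / zoom
    bottom = (view_h - ty) / zoom
    # Grow to whole pixels and clamp to the bitmap bounds.
    x0 = max(0, math.floor(left))
    y0 = max(0, math.floor(top))
    x1 = min(bmp_w, math.ceil(right))
    y1 = min(bmp_h, math.ceil(bottom))
    return x0, y0, max(0, x1 - x0), max(0, y1 - y0)


# A 3000x2000 bitmap at 4x zoom, scrolled by (2000, 1000) screen pixels,
# shown in an 800x600 viewport.
rect = visible_source_rect(4, -2000, -1000, 800, 600, 3000, 2000)
```

Here only a 200x150 region needs to be copied and scaled, giving an 800x600 result instead of a 12000x8000 one. When the inverse-mapped corner is not on a whole pixel, a small leftover translation would still be needed, which this sketch ignores.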
Monday, January 5th, 2009
I wrote a blog entry about caching sql with php before; since then I have rewritten the php functions that I’ve been using. The main difference between the older functions and what I will describe here is storing in memory versus on disk.
Tuesday, December 16th, 2008
Last week google released a framework similar to flash, with the only difference being that it runs x86 code. They developed a sandbox to limit the types and parameters of instructions that can be run in the browser. It draws to a framebuffer, which then gets updated in the browser. They demonstrated quake and opengl applications running inside their native client.
Although they don’t have a widget toolkit yet, after some research I found that it is very easy to write an SDL extension, which means that all the libraries running over SDL can be made available, including 3D game libraries, animation and many widget libraries.
– Just some thoughts…..
Monday, December 1st, 2008
I’ve been doing more flex testing, and my initial impression is that it is faster than I expected. I did at least 7 different K-means tests with it, and tried to implement some image processing algorithms including edge detection, gradients and blurring, and it is not slow.
However, flex does not have multithreading, which 1. makes it harder to do longer processing and 2. makes it impossible to distribute the load across CPU cores. As a result, I had to implement a timer that calls a function at a short interval; inside that function I broke my processing into smaller pieces, and had to implement several states and switch between them. This made my implementation much more difficult and more complicated. I hope that at some point Adobe will add multithreading.
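The timer-plus-states workaround can be sketched roughly as follows. This is Python rather than ActionScript, the job (averaging pixel values) and all names are invented for illustration, and a plain loop stands in for the timer that would call `tick` at each interval.

```python
class ChunkedMeanJob:
    """Computes a mean in small slices so no single call runs for long."""

    def __init__(self, pixels, chunk_size):
        self.pixels = pixels
        self.chunk_size = chunk_size
        self.position = 0
        self.total = 0
        self.mean = None
        self.state = "SUMMING"

    def tick(self):
        """One timer callback; returns True while work remains."""
        if self.state == "SUMMING":
            # Process only the next slice, then give control back.
            end = min(self.position + self.chunk_size, len(self.pixels))
            self.total += sum(self.pixels[self.position:end])
            self.position = end
            if self.position >= len(self.pixels):
                self.state = "AVERAGING"
        elif self.state == "AVERAGING":
            self.mean = self.total / len(self.pixels) if self.pixels else 0.0
            self.state = "DONE"
        return self.state != "DONE"


job = ChunkedMeanJob(list(range(10)), chunk_size=4)
ticks = 0
while True:  # stand-in for the timer firing repeatedly
    ticks += 1
    if not job.tick():
        break
```

Each extra processing stage adds another state, which is where the added complexity described above comes from.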
Sunday, November 23rd, 2008
I think adobe made a perfect recipe by releasing the flex sdk as open source. Moreover, it is platform independent, as the compiler is written in java. Over the last couple of weekends I put in some time to see what’s there. I was able to set up a simple dev environment with emacs on both windows and Linux.
Coding flex on linux seemed to be easier than on windows (at least for me). One big thing is the results of the trace command, which appear on the console on linux but are piped to some hidden file in windows that I have to keep refreshing to know what’s going on inside my code!
I’ve been testing flex to see how far it can go in image processing and image understanding. I think that flex has lots of opportunities, and is at least better than java when it comes to online apps. The reason I personally don’t like java for applets is the big memory footprint of its runtime environment and the strange long delay when it loads, which make it a very discouraging solution when compared to adobe’s flash.
I was fiddling with bitmaps and bitmapdata, and after some tests nothing was working. It turns out that the Flex help is written assuming that you’re developing in flash; it looks like they copied it from their old docs without changing the examples. See here & my comment at the bottom:
So if it happens that you want to display flex objects using the addChild function, make sure you call the addChild on a UIComponent & not any other flex object, as they won’t recognize bitmaps or other dynamically created objects. I just added one in the mxml file:
<mx:UIComponent id="myUIComponent" />
Also read here: