Category Archives: random thoughts

That one month

Found some old notes, written between thesis writing, Bioshock and Korean dramas. Back in 2007, when I looked more intelligent than I do now.

Writing a thesis involves many things: (to simplify) yourself and the rest of the world. Yourself: your mood, your determination, your spirit, your body, your eyes, etc. The rest of the world: your computer and its contents, your programs, your LaTeX editor, your friends, your family, your lab, the Internet, etc.

$ ======================================== $
Saturday, September 8th 2007, Gwangju, South Korea.
5:10 PM

@ flashnews
I just finished writing my BibTeX entries. It’s about 118 papers or fewer, and counting. I have started playing Bioshock and downloading some TV shows for distraction.


@ content
My BibTeX file basically contains everything that I want to do, so depending on what I implement, maybe I will not include all of it. But what is it that I will implement? My original intention was to put in everything we did during the Samsung project, but that would take credit from my team, because they developed most of it, not me. My contribution was just in the beginning phase; the rest is theirs, and that includes moving object detection, human detection and tracking. So none of my work is there!

I really want to include this in my thesis
“This dissertation is the result of my own work and includes nothing which is
the outcome of work done in collaboration except where specifically indicated
in the text”

Programming-wise, this is what we have from the Samsung project.

I. Devices: handles the camera to get images from the sensor into IplImage
1. Camera. Interface for the Prosilica camera that we use now

II. Vision: several algorithms that process the input IplImage sequentially
1. Panorama. This class converts the omnidirectional image into a panoramic one by using a LUT
2. OpticalFlow. Estimates optical flow using the Lucas-Kanade method
3. MovingObjectDetector. Detects moving objects by using the flow-compensated frame difference
4. HumanDetector. Detects humans in the moving object area. This implementation uses SvmClassifier and SvmUtil.
5. RegionTracker. Implements tracking by using a mean-shift based tracker
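
The LUT idea behind the Panorama class can be sketched roughly as below. This is only an illustration in Python with NumPy, not the IplImage-based code from the project: the function names, the nearest-neighbour sampling and the assumption that the mirror centre and the inner and outer radii are known are all mine.

```python
import numpy as np

def build_unwrap_lut(centre, r_min, r_max, pano_height, pano_width):
    """Precompute the source (row, col) index for every panoramic pixel.

    Hypothetical helper: each panoramic row is one radius, each column one angle.
    """
    cy, cx = centre
    radii = np.linspace(r_min, r_max, pano_height)
    angles = np.linspace(0.0, 2.0 * np.pi, pano_width, endpoint=False)
    radius_grid, angle_grid = np.meshgrid(radii, angles, indexing="ij")
    # Nearest-neighbour lookup; the direction of increasing angle is arbitrary here.
    rows = np.rint(cy + radius_grid * np.sin(angle_grid)).astype(np.intp)
    cols = np.rint(cx + radius_grid * np.cos(angle_grid)).astype(np.intp)
    return rows, cols

def unwrap(omni, lut):
    """Apply the precomputed table: one fancy-indexing pass per frame."""
    rows, cols = lut
    return omni[rows, cols]

# Synthetic 101x101 "omnidirectional" image with the mirror centre at (50, 50).
omni = np.arange(101 * 101).reshape(101, 101)
lut = build_unwrap_lut((50, 50), r_min=10, r_max=40, pano_height=16, pano_width=64)
pano = unwrap(omni, lut)
```

The table is built once, so the per-frame cost is just the indexing step. The caller has to make sure `r_max` keeps every lookup inside the image.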


My thesis should improve on the Samsung implementation, specifically in:
1. Image enhancement part
2. Moving detection part -> automatic threshold or other methods
3. Human detector -> feature and classifier
4. Tracker -> feature and method.

Enhancement is a standard procedure. For now I only have homomorphic filtering and gamma correction; that should suffice. Illumination invariance is something I would like to solve, but it has the lowest priority.
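
For reference, gamma correction is small enough to sketch in a few lines. This is a minimal version assuming an 8-bit grayscale frame in a NumPy array; the function name is only for illustration, and homomorphic filtering is not shown.

```python
import numpy as np

def gamma_correct(image, gamma):
    """Apply power-law (gamma) correction to an 8-bit grayscale image."""
    # Normalise to [0, 1], apply the power law, then map back to [0, 255].
    normalised = image.astype(np.float64) / 255.0
    corrected = np.power(normalised, gamma)
    return np.round(corrected * 255.0).astype(np.uint8)

# gamma < 1 brightens dark regions, gamma > 1 darkens them.
frame = np.array([[0, 64], [128, 255]], dtype=np.uint8)
brightened = gamma_correct(frame, 0.5)
```

Black and white stay where they are; only the mid-tones move.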

The motion detector is based on the x-projection histogram of the compensated frame difference image. Compensated in the sense that each window’s motion, as estimated by optical flow, is translated back to its initial position. Any independent motion will disrupt the compensation, so we get dominant motion areas caused by an object or by something else; that includes highly reflective surfaces and highly cluttered objects, because most of the motion detected with this method is caused by edges. And that is really a problem in cluttered areas, where edges are everywhere. When it was tested on the lab sequence, it detected motion almost everywhere! I don’t care about robot motion here. The robot should move naturally, not in an engineered way. Maybe it is because I use a big window (20×20), but I decided on that to make the system faster. A smaller window would give more points to evaluate, and that would take a long time.
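
The projection step described above can be sketched as follows. It assumes the flow-compensated difference image is already available as a NumPy array; the two thresholds and the function name are hypothetical and not taken from the project code.

```python
import numpy as np

def x_projection_regions(compensated_diff, pixel_threshold, column_threshold):
    """Return the x-projection histogram and the column ranges with high motion."""
    # Binarise the compensated frame difference.
    motion_mask = np.abs(compensated_diff) > pixel_threshold
    # Project onto the x axis: count motion pixels in every column.
    histogram = motion_mask.sum(axis=0)
    active = histogram >= column_threshold
    # Group consecutive active columns into candidate regions (inclusive ends).
    regions = []
    start = None
    for x, is_active in enumerate(active):
        if is_active and start is None:
            start = x
        elif not is_active and start is not None:
            regions.append((start, x - 1))
            start = None
    if start is not None:
        regions.append((start, len(active) - 1))
    return histogram, regions

diff = np.zeros((6, 10))
diff[1:5, 3:6] = 40.0   # a synthetic moving blob
diff[2, 8] = 40.0       # an isolated edge response
histogram, regions = x_projection_regions(diff, pixel_threshold=20, column_threshold=3)
```

In this toy case the isolated response in column 8 falls below the column threshold. In a cluttered scene many columns would pass it, which is the edge problem described above.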

Despite the problems, I am really happy with how we compensate motion without any notion of geometry whatsoever, unlike many egomotion estimation methods. Although I feel that is how you should do it: lift the image into its original representation, like a half sphere, and then perform the analysis in that representation. But that is too difficult.

Because it relies on segmenting the optical flow, one way is to segment the optical flow vectors. However, if the window is too big, maybe the result will not be too good. Still, this is one alternative and worth trying. Things like attention are also interesting.

In my opinion, the detector and the tracker should use the same feature. I like the idea of region covariance because it can be used for detection and tracking simultaneously, and the results are convincingly good. I am thinking of orientation + intensity histogram + PCA + other stuff.
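
A rough sketch of a region covariance descriptor, to make the idea concrete. The feature set here (pixel position, intensity and absolute gradients) is one common choice and an assumption on my part, as is the function name; the region must lie fully inside the image.

```python
import numpy as np

def region_covariance(gray, top, left, height, width):
    """Covariance of per-pixel features [x, y, I, |Ix|, |Iy|] inside a rectangle."""
    patch = gray[top:top + height, left:left + width].astype(np.float64)
    # np.gradient returns the derivative along rows first, then along columns.
    grad_y, grad_x = np.gradient(patch)
    ys, xs = np.mgrid[0:height, 0:width]
    # One row per feature, one column per pixel.
    features = np.stack([
        xs.ravel(), ys.ravel(), patch.ravel(),
        np.abs(grad_x).ravel(), np.abs(grad_y).ravel(),
    ])
    return np.cov(features)

rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(20, 20))
descriptor = region_covariance(image, top=2, left=3, height=8, width=6)
```

The descriptor is a small symmetric matrix whose size depends only on the number of features, not on the region size. That is what lets the same representation serve both a detector window and a tracked region.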

Like I always believe, no matter how good your idea is, without implementation it is nothing. So better start working now.

@ writings
As for writing, the plan is to use the multibib package for multiple bibliography files in LaTeX.

Title :
Development of Omni directional Vision-Based Human Detection Module for Mobile Robot

The presentation structure : (tentative)
1. Introduction
2. Visual perception and omni directional vision : overview and related works
3. Where to Look? : visual attention by moving object detection
4. Is it human? : from moving object to human hypothesis verification
5. Where is he now? : tracking human
6. Evaluation
7. Conclusion and Remarks

$ ======================================== $
Sunday, September 9th 2007, Gwangju, South Korea.
2:47 AM

@ flashnews
I redid the Medical Pavilion stage because apparently, if you die while fighting the Big Daddy and load your last save, the Little Sister vanishes into thin air. So you have to fight the Big Daddy again and again. And also because I replaced my electric plasmid, which was a mistake, because you cannot buy electric shotgun rounds, which are handy for killing the Big Daddy. So I redid it all again. Damn it.

@ writings
Screw multibib. I think it should have worked. I made a mistake by not closing the braces of a bib entry, and it screwed it all up. So I just put all the bib entries in a single file and forgot the rest.

As I want to make some changes to make it more originally mine, I face a new challenge: writing the programs. I think this should be the priority for two days or so before I can write anything, because the idea is not so clear now.
What is spatio-temporal attention? How can we detect a human who is not moving, and not overdo it when the human is moving?
What is the probabilistic Bayesian machine for detecting and tracking?
What is the robust feature?

I think I should rethink what I write. Be careful what you wish for.

$ ======================================== $
Sunday, September 9th 2007, Gwangju, South Korea.
7:44 PM

@ flashnews
work all night and sleep all day. what a life

@ contents
There are several ideas that I want to implement. These include:
– orientation histogram shape backprojection image as a saliency feature to reject all non-human-shaped areas
– Sooyeong simple saliency algorithm
– working with downsampled image
– optical flow grouping
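
As a reminder of what the first idea starts from, here is a minimal sketch of a magnitude-weighted gradient orientation histogram. It covers only the histogram, not the backprojection step; the bin count and the use of unsigned orientations are assumptions.

```python
import numpy as np

def orientation_histogram(gray, bins=9):
    """Normalised histogram of unsigned gradient orientations, weighted by magnitude."""
    grad_y, grad_x = np.gradient(gray.astype(np.float64))
    magnitude = np.hypot(grad_x, grad_y)
    # Fold orientations into [0, pi) so opposite gradient directions share a bin.
    orientation = np.mod(np.arctan2(grad_y, grad_x), np.pi)
    hist, _ = np.histogram(orientation, bins=bins, range=(0.0, np.pi), weights=magnitude)
    total = hist.sum()
    return hist / total if total > 0 else hist

# A horizontal intensity ramp has all of its gradient energy at orientation 0.
ramp = np.tile(np.arange(8.0), (8, 1))
hist = orientation_histogram(ramp)
```
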
$ ======================================== $
Sunday, September 9th 2007, Gwangju, South Korea.
11:37 PM

@ contents
Downsampling is just great: with a scale factor of 2, optical flow went from 200 ms down to 40 ms!
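
A minimal sketch of downsampling by a factor of 2, assuming simple 2x2 block averaging (the notes do not say which method was actually used). The number of pixels drops to a quarter, so there is correspondingly less work for the flow estimation.

```python
import numpy as np

def downsample_by_two(gray):
    """Halve each image dimension by averaging non-overlapping 2x2 blocks."""
    height, width = gray.shape
    # Drop a trailing odd row or column so the image tiles exactly into 2x2 blocks.
    trimmed = gray[:height - height % 2, :width - width % 2].astype(np.float64)
    blocks = trimmed.reshape(trimmed.shape[0] // 2, 2, trimmed.shape[1] // 2, 2)
    return blocks.mean(axis=(1, 3))

small = downsample_by_two(np.arange(16).reshape(4, 4))
```
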

Backprojection is OK, but the orientation histogram of the upper body is really discriminative, so the backprojection result still does not significantly reduce the spatial search space.

Based on my observation, I think the best optical flow estimate comes from using a window size twice the grid step.
If the grid step is 10, then the window size is 20. The flow is smooth and less affected by noise, except that the computation jumps by as much as two or three times compared with the case where they are the same.

$ ======================================== $
Tuesday, September 11th 2007, Gwangju, South Korea.
10:55 PM

@ flashnews
Doing good on Bioshock…. nice

@ contents
Saliency at one level with one kernel: not really good. Maybe multiple scales will be better. But I have to understand multi-scale theory first.
And color saliency is just not so good….
Maybe it doesn’t have any effect…

$ ======================================== $
Thursday, September 13th 2007, Gwangju, South Korea.
4:02 PM

@ flashnews
Ramadhan mubarak.

Bioshock….. I think I will play again after this one

@ contents
Due to the failure of the saliency system, I decided to just put in something that I can possibly do in one week or less. That’s it, give different bla..bla.., with images from the last Samsung machine. Since this is just a draft, I think I’ll be able to make several changes.

@ writings
Proceeding smoothly. I have my contents, and now I have an idea of what I should write.
happy!!
done with chapter 1.
And a beautifully rendered PDF from LaTeX! Really rewarding.

$ ======================================== $
Saturday, September 15th 2007, Gwangju, South Korea.
4:49 AM

@ flashnews
Sleeping behavior disorder? Yesterday I slept from 00 until 12, and now I cannot sleep. Or am I pushing myself?

@ writings
Error in indexing image!!!!
I don’t know what’s wrong but I don’t like it wrong!!
memoir error? should I change my template?

And the new memoir is incompatible with subfigure? How am I gonna do that?

And the figure list is very, very wrong if the name is too long. How can they do that to me!

$ ======================================== $
Saturday, September 15th 2007, Gwangju, South Korea.
2:37 PM

@ writings
I’ve found several thesis templates on the Internet. What can I say? It knows too much.
I want to try the Cambridge template.

The main problem is image indexing!!
Now I know how to create a long image caption without breaking the one-line rule for generating the list!

@ contents
Multi-resolution seems to come up several times. From LK flow, though somehow I cannot recall it clearly.
But…
Saliency, human detection and tracking can all make use of it!!
so it is really worth it!
I even listed it in my ToC….

$ ======================================== $
Sunday, September 16th 2007, Gwangju, South Korea.
9:15 PM

@ writings
Damn, chapter three is done
so much for the draft.

two more chapters to go and I can continue with improvements.

$ ======================================== $
Tuesday, September 18th 2007, Gwangju, South Korea.
5:00 AM

@ flashnews
Why do these people stay up all night lately? If not for this thesis, I would be sleeping peacefully now.

@ writings
So this is not the draft?
What the hell…
Since prof said the engineering committee closes its eyes, just take advantage of it.

@ contents
Prof wants to use another approach. Basically it is similar to edgelets, except that he wants to use Gabor filter responses trained on several parts of the upper body: the half circle of the head and the double neck contour. The filter responses yield feature vectors. Maybe it will be similar to wavelets. Babble whatever you want. Implementation is one of the important parts of the process.

7:54 AM

Damn, arguments with prof always running in my head
I can’t concentrate.

Basically, I cannot agree with how you handle things. You seem to forget that in computer vision practice, implementation is the most important thing. From what I understand, correct me if I am wrong, you seem to underestimate the time needed to turn a non-trivial, new approach into practice.

I have observed some famous researchers in the computer vision field who propose new and novel approaches. The gap between publications is sometimes one year. In one year, they deliver a good working result and an evaluation, which shows consistency in research. This is what we hope to experience.

$ ======================================== $
Thursday, September 20th 2007, Gwangju, South Korea.
8:04 AM

@ writings
Despite many issues, I think I am done with the draft.
My body can’t take any more. My hand can’t even type correctly.

I think my flexor muscle is fatigued.
can’t even focus!!

I will continue after one or two weeks
ciao

$ ======================================== $
Thursday, October 08th 2007, Gwangju, South Korea.
3:03 PM

@ flashnews
I am back !!!
back to thesis writings.

Bioshock is done. Tried World in Conflict, but there was not enough chemistry between us to make me continue. Watched a lot of Korean and Japanese dramas…. Que sera-sera, Ms. Kim one million dollars, My Husband, etc..


I think I should be ready to restart the whole thing and make something!!

@ contents
Over the last week, I have become interested in combining generative PCA and discriminative LDA in a more principled way, rather than naively using PCA as a pre-processing stage. Fidler’s work, which notes that a truncated basis can severely affect performance, and Grabner’s feature-based combination using boosted Haar features are interesting. But I am more interested in trying to fill in the blanks: which combinations are still unnoticed, so that I can somehow make an original contribution to machine learning. So I am trying to find:
1. Kernel PCA + LDA (I think this has not been exploited, although it follows somewhat directly from Fidler’s work)
2. Probabilistic PCA + LDA in a linear Gaussian setting (this is already mentioned in Ioffe’s work but not evaluated yet)
3. Non-linear probabilistic PCA + LDA. This involves Gaussian processes as the non-linear kernel, based on Lawrence’s Gaussian Process Latent Variable Model (GPLVM). The discriminative combination has already been explored.
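
For contrast with the three combinations above, the naive baseline (PCA as a plain pre-processing stage, followed by a two-class Fisher discriminant) can be sketched as below. The synthetic data, the number of retained components and the small ridge term are assumptions made only for illustration.

```python
import numpy as np

def pca_basis(samples, n_components):
    """Return the sample mean and the top principal directions (as columns)."""
    mean = samples.mean(axis=0)
    centred = samples - mean
    # Rows of vt are principal directions, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(centred, full_matrices=False)
    return mean, vt[:n_components].T

def fisher_direction(features, labels):
    """Two-class Fisher discriminant direction w = Sw^{-1} (m1 - m0)."""
    class0 = features[labels == 0]
    class1 = features[labels == 1]
    mean0, mean1 = class0.mean(axis=0), class1.mean(axis=0)
    within = (class0 - mean0).T @ (class0 - mean0) + (class1 - mean1).T @ (class1 - mean1)
    # A small ridge keeps the solve stable when the scatter is near-singular.
    within += 1e-6 * np.eye(within.shape[0])
    return np.linalg.solve(within, mean1 - mean0)

# Synthetic two-class data: the classes differ only along the first dimension.
rng = np.random.default_rng(0)
class0 = rng.normal(0.0, 1.0, size=(50, 10))
class1 = rng.normal(0.0, 1.0, size=(50, 10))
class1[:, 0] += 6.0
samples = np.vstack([class0, class1])
labels = np.array([0] * 50 + [1] * 50)

mean, basis = pca_basis(samples, n_components=3)
reduced = (samples - mean) @ basis      # generative step: truncate the basis
w = fisher_direction(reduced, labels)   # discriminative step in the subspace
scores = reduced @ w
```

Here the discriminative direction happens to carry the most variance, so truncation keeps it. When it does not, PCA can throw it away before LDA ever sees it, which is the weakness of the naive pipeline.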

My intuition is that by using partial Hausdorff matching, we can use both generative and discriminative subspace approaches (this is shown by Huttenlocher and Felzenszwalb), so combining them should not be that hard, and if I can add a kernel then it is a PAMI paper.
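
As a reminder of what the partial Hausdorff measure computes, here is a minimal sketch of the directed version on 2-D point sets. It only illustrates the ranked-distance idea, not the subspace variants mentioned above; the brute-force distance computation and the function name are assumptions.

```python
import numpy as np

def partial_hausdorff(model_points, image_points, fraction):
    """Directed partial Hausdorff distance from model_points to image_points."""
    # Distance from each model point to its nearest image point (brute force).
    diffs = model_points[:, None, :] - image_points[None, :, :]
    nearest = np.sqrt((diffs ** 2).sum(axis=2)).min(axis=1)
    # Rank the distances and take the K-th smallest instead of the maximum,
    # so that a fraction of outlying model points is ignored.
    k = max(1, int(np.ceil(fraction * len(nearest))))
    return np.sort(nearest)[k - 1]

model = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [10.0, 0.0]])
image = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]])
full = partial_hausdorff(model, image, fraction=1.0)      # classic directed Hausdorff
partial = partial_hausdorff(model, image, fraction=0.75)  # ignores the worst quarter
```

With `fraction=1.0` the single outlier at (10, 0) dominates the distance; with `fraction=0.75` it is ignored.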

I might have to hold off on the probabilistic settings for now, since finding the parameters involves optimization, which I have to understand first, and I think it will not be fit to run in real time. Most of it just involves matrix inversion. I still do not have enough understanding to start on this.

+Just now I got the feeling that Felzenszwalb’s work on generalized Hausdorff is actually both generative and discriminative. But its properties still need to be evaluated in a principled way. In Huttenlocher’s subspace Hausdorff, the subspace approximation is considered a binary correlation approximation.

@ writings
I want to change the thesis template to the Cambridge template. This template, memoir, is suitable for books but not for engineering ones.
write the journals….

@ 4:58 PM

The effort to understand machine learning theory takes more than I initially estimated. This is a bit hard.
The Bayesian formulation and the numerical optimization involved seem daunting to me. In the end, all these methods involve finding parameters from a set of training data.

I am interested particularly in
PCA+LDA
SVM
Boosting.

+ Instead of focusing on machine learning, the feature representation can also be exploited. But in the end, machine learning analysis of the feature is needed to exploit certain structure arising in the feature.

Latent space, Gaussian processes, kernels, whatever. These things are mixed up in my head.

$ ======================================== $

Senbazuru (Thousand Origami Cranes)

In Japan there is a belief that if someone folds 1000 origami cranes, one of their wishes will be granted. Since last year, when I was involved in a project in Jakarta, I have been folding cranes in the middle of coding fatigue. And so far I have not reached 100. Folding origami cranes, it turns out, can get tedious too. When the software project I am working on now started to feel boring, I went back to continuing the senbazuru project that was already under way.

But this is not about senbazuru. This is about the “orderliness of traffic” in the city of Bandung, which is indeed famous for its traffic jams, its potholed roads, and the angkot (public minibuses) and motorcyclists that always make me “smile”. A few days ago, on the way home from the office at Surapati Core to my house in Buah Batu, I was stuck in a traffic jam for 3 hours, from the Pahlawan-Suci intersection to the Katamso-Cisokan junction, which is probably only about 500-600 m. To pass the time, I started folding origami cranes, and by the time I got out of the jam I had made 12 cranes, in between eating snacks, drinking and pressing the gas pedal to move 0.5-1 m forward.

the cranes

12 cranes

So, why do traffic jams happen? In my opinion, one rather puzzling thing is why car drivers like to turn 1 lane into 2. This seems to be what causes quite severe congestion, because there is a bottleneck when the lanes are forced to merge back into one. Of course there were also the cars coming out of Griya and those coming in from the Katamso extension and Cisokan, whose volume was quite high at the time, not to mention those coming in from Pahlawan and Mustofa. At that moment I was fairly sure that this lane-adding hypothesis was the culprit, so I thought that if I could design a simulation to prove it, I would try. But so far I have not had the chance to do it. Perhaps another factor was the increase in vehicle volume that somehow happened that afternoon. So the lesson is: never take a road where car drivers can be “smart” enough to add a lane.

When a friend at the office asked one afternoon why I was carrying origami paper around, I said it was to pass the time, such as on the road during a traffic jam. Which makes me think: maybe we should be careful about what we say.

The upside is that my origami target dropped by 12 that afternoon. Since then, orderly traffic in Bandung has been on my wish list, which I might say out loud when I manage to finish the 1000th crane. My community service :p
