[oharasteve enters the chat - 23:00:16]
(23:00:52) [oharasteve] : Steve Test on Tuesday
[AI_user enters the chat - 13:33:24]
(13:34:21) [AI_user] : testing chatroom
[ssiju enters the chat - 15:18:52]
[bylander enters the chat - 16:45:59]
(16:46:30) [bylander] : This is only a test.
[Kwan enters the chat - 17:45:23]
[vv enters the chat - 18:33:35]
[TonyC enters the chat - 18:43:05]
(18:43:38) [TonyC] : Tony is in the chatroom and testing.
[bylander enters the chat - 18:52:32]
(18:53:03) [bylander] : Good evening, students
(18:55:38) [vv] : good evening, sir
(18:56:56) [bylander] : I hadn't thought of it until now
[ssiju enters the chat - 18:56:56]
(18:57:11) [bylander] : but I could have modified the software to display
(18:57:32) [bylander] : a webpage of my (or your) choice
[oharasteve enters the chat - 18:58:00]
(18:58:05) [bylander] : That way I could have shown the slides instead of asking you to make copies.
[Kwan enters the chat - 18:58:46]
(18:59:02) [oharasteve] : Steve is present and accounted for ...
(18:59:30) [bylander] : I guess before we start, are there any questions?
(18:59:47) [TonyC] : None from me...
(19:00:39) [bylander] : To make this go more smoothly, all my lines will be in black.
(19:01:02) [TonyC] : Then I will switch.
(19:01:04) [bylander] : If you have any comments or questions, please use a readable color.
[vv enters the chat - 19:01:21]
(19:01:22) [bylander] : First of all, some comments on HW9
(19:01:44) [bylander] : My Push op was Push(b,x,y,r)
(19:02:00) [bylander] : meaning Push box b from location x to location y in room r
[TonyC enters the chat - 19:02:19]
(19:02:28) [bylander] : In some hws, the preconds and effects had variables that were not parameters
(19:03:09) [bylander] : My precond for Push was Location(x,r) & location(y,r) & box(b,x) & at(x) & x != y & on(floor)
(19:03:22) [bylander] : The key is to get both locations in the same room somehow
(19:03:33) [bylander] : Some students had a sameroom predicate
(19:03:58) [bylander] : The effect needs to delete at(x) and box(b,x) and add at(y) and box(b,y)
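(A minimal sketch of the Push operator described above, written as a STRIPS-style structure in Python; the dictionary layout is an illustrative assumption, not a required HW9 format.)

    # Push(b, x, y, r): push box b from location x to location y in room r.
    # Sketch only: predicate spellings follow the chat; the layout is assumed.
    push_op = {
        "action":   "Push(b, x, y, r)",
        "preconds": ["Location(x, r)", "Location(y, r)", "Box(b, x)",
                     "At(x)", "x != y", "On(floor)"],
        "delete":   ["At(x)", "Box(b, x)"],
        "add":      ["At(y)", "Box(b, y)"],
    }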
(19:04:11) [TonyC] : Pardon; but is there some way to keep the screen from returning to the top of the chat every ten seconds? I keep having to scroll down to see what is being said....
(19:04:20) [bylander] : My initial state needed a lot of location(x,r) literals,
(19:04:28) [bylander] : meaning location x is in room r
(19:04:43) [vv] : me 2
(19:04:59) [bylander] : My answers are on WebCT at your leisure
(19:05:20) [bylander] : The chat screen is supposed to stick to the bottom with an HTML anchor.
(19:05:44) [bylander] : Wait a minute and I can try to fix it
(19:06:01) [oharasteve] : I also have that problem
(19:06:58) [bylander] : Ok, maybe I fixed it
(19:07:10) [bylander] : It will only display the last 15 messages.
(19:07:11) [oharasteve] : Scrolled to the middle now
(19:07:25) [bylander] : Well, that will be better I suppose.
(19:07:29) [oharasteve] : Yes, much better ... thank you
(19:07:39) [TonyC] : It looks good on my IE, but there is no scroll bar. That is at least better, though.
(19:07:53) [bylander] : Any questions on Lab 3?
(19:07:59) [ssiju] : yeah, it is good now
(19:08:15) [Kwan] : I have a question.
(19:08:20) [vv] : good now
(19:08:35) [bylander] : Go ahead, Kwan
(19:09:12) [bylander] : Changed to display the last 20 messages.
(19:09:30) [Kwan] : If we use the first 50,000 cases to train and the last 10,000 to test, how is this going to be passed off? Do we just need to talk about our results in the write up?
(19:10:13) [bylander] : Actually, each additional case is a test as it hasn't been seen before.
(19:10:33) [bylander] : But focusing on the last 10,000 gives a good idea of the final error rate.
(19:10:46) [bylander] : The total error rate should be less than 20%.
(19:10:59) [bylander] : I'm sure some of you are doing much better than that.
(19:11:19) [Kwan] : Meaning the last 10,000 won't cause additional learning?
(19:11:29) [oharasteve] : Human error rate for me was 13%
(19:11:57) [bylander] : It will still be learning on the last 10,000 which is ok.
(19:12:14) [bylander] : How many examples, oharasteve?
(19:12:21) [Kwan] : OK. Thanks.
(19:12:33) [oharasteve] : 100
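(One way to read "each additional case is a test as it hasn't been seen before" is a test-then-train loop. A minimal sketch, assuming a learner object with hypothetical predict/update methods and a stream of (features, label) cases.)

    # Progressive (test-then-train) evaluation: predict each case before
    # learning from it; report the overall error and the error on the
    # last 10,000 cases as an estimate of the final error rate.
    def progressive_error(learner, cases, tail=10_000):
        mistakes = []
        for features, label in cases:
            mistakes.append(learner.predict(features) != label)  # test first
            learner.update(features, label)                      # then train
        overall = sum(mistakes) / len(mistakes)
        final = sum(mistakes[-tail:]) / min(tail, len(mistakes))
        return overall, final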
(19:13:07) [bylander] : Anybody try NLTK yet?
(19:13:34) [oharasteve] : Not yet .... this weekend for sure
(19:13:51) [TonyC] : (Stony silence)
(19:13:57) [bylander] : Ok, let me know ASAP if there are permission problems.
(19:14:04) [Kwan] : Nope
(19:14:25) [bylander] : Ok, let's go to vision, should be a challenge to do this text only.
(19:15:02) [bylander] : Vision is sometimes stated as the problem of getting information from 2D images.
[jpatil enters the chat - 19:15:06]
(19:15:09) [oharasteve] : Ironic it seems :)
(19:15:29) [bylander] : The images might be from different times and/or different locations.
(19:15:55) [bylander] : E.g., consider when you are walking around, you get different images from different spots
(19:16:18) [bylander] : Page 2 and 3 of the handout provide some intro stuff,
(19:16:31) [bylander] : then a list of visual processing steps.
(19:16:52) [bylander] : First, the image. Then, feature extraction (edge detection, segmentation)
(19:17:17) [bylander] : then shape recovery using stereo, motion, texture, shading, etc.
(19:17:30) [bylander] : Finally, object recognition.
(19:17:48) [bylander] : Today I hope to get through the intro and edge detection.
(19:18:08) [bylander] : Page 4 shows the pinhole camera model and page 5 some properties of that model
(19:18:36) [TonyC] : I don't see anything on segmentation.
(19:19:02) [bylander] : I just mention it in the list, segmentation is ignored thereafter
(19:19:03) [oharasteve] : Slide Vision - 3 mentions Segmentation
(19:19:33) [bylander] : Anyway light goes from the world (say point (X,Y,Z))
(19:19:51) [TonyC] : Oops. I was reading my page numbers instead of yours...
(19:19:58) [bylander] : thru the pinhole (0,0,0) and then the image plane (x,y,-f)
(19:20:14) [bylander] : Since this is a line that goes thru the origin,
(19:20:40) [bylander] : we have the equation on page 5 (both Z/Z and -f/-f are equal to 1,
(19:21:01) [bylander] : so X/Z = x/-f and Y/Z = y/-f)
(19:21:22) [bylander] : You can see the solution at the bottom of page 5.
(19:21:46) [bylander] : This is a long-winded way of saying that 3D points map to a
(19:22:00) [vv] : (x,y,-f) or (-x*f/z, -y*f/z, -z) ?
(19:22:08) [bylander] : 2D point in the image, so there is plenty of room for ambiguity.
(19:22:27) [bylander] : We will see later (prob. Mon.) that this ambiguity can be mostly
(19:22:36) [bylander] : resolved via multiple images.
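(A small worked sketch of the projection just described: a world point (X, Y, Z) maps through the pinhole to the image-plane point (-fX/Z, -fY/Z), so different 3D points along the same ray land on the same pixel. The numbers below are made up for illustration.)

    # Pinhole projection onto the image plane z = -f.
    def project(X, Y, Z, f=1.0):
        return (-f * X / Z, -f * Y / Z)

    print(project(2.0, 1.0, 4.0))   # (-0.5, -0.25)
    print(project(4.0, 2.0, 8.0))   # (-0.5, -0.25) -- same pixel, so depth is lost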
(19:23:05) [bylander] : Page 6 shows the exciting image of a mysterious object.
(19:23:26) [oharasteve] : But that assumes that either the camera or the object is moving or movable
(19:23:36) [bylander] : There is a square highlighted in the upper right corner, which
(19:24:03) [TonyC] : It might just imply multiple cameras.
(19:24:11) [bylander] : oharasteve, we move around all the time, shift positions,
(19:24:13) [oharasteve] : Yes
(19:24:37) [bylander] : fidget, etc. Also we have stereo vision.
(19:25:15) [bylander] : Also, human eyes don't lock in a single position unless you are in a staring contest.
(19:25:25) [oharasteve] : No problem for humans, but there are plenty of stationary video cameras
(19:25:31) [bylander] : Your eyes are constantly rolling around.
(19:25:54) [bylander] : The eyes also get feedback from adjusting the focus.
(19:26:40) [bylander] : Though I am at an age where my glasses need to compensate
(19:26:49) [bylander] : for decreasing focussing ability.
(19:27:27) [bylander] : For stationary cameras, you need 2, maybe 3 to get
(19:27:35) [bylander] : good shape recovery.
(19:27:58) [bylander] : Anyway, maybe I can go to page 7, where a 12x12 section
(19:28:15) [bylander] : of the UFO is shown, along with the grayscale values.
(19:28:33) [TonyC] : Why do we need multiple cameras to do that? I can do it from a photograph with one eye shut.
(19:28:51) [bylander] : Looking at the 12x12 image, there is a big change you can see
(19:29:32) [bylander] : TonyC, you have a good model of the world.
(19:29:45) [bylander] : Many visual illusions are based on fooling that model.
(19:30:29) [bylander] : Visual processing in machines tries to make up for a poorer world model
(19:30:47) [bylander] : (as compared to humans) with more processing at earlier stages.
(19:31:06) [vv] : i used a 3D camera before, it just has one lens, but needs to take about 3 pictures to build a 3D model
(19:31:13) [TonyC] : True enough, but illusions aside, I guess I am processing shadow and light or something.
(19:31:50) [bylander] : You perhaps are remembering the 3D structure of the scene
(19:32:06) [bylander] : or a similar scene with substitutions.
(19:32:39) [bylander] : As an analogy, your argument might be that you are
(19:32:55) [bylander] : processing English using something
(19:33:12) [bylander] : when you use natural language.
(19:33:28) [bylander] : However, when AI has tried to accomplish tasks like
(19:33:47) [bylander] : vision or NL, they turn out to be amazingly difficult.
(19:34:14) [TonyC] : Well, I mean I see the shadow of a bump and by the holistic effect of determining the light source in the image, I know the dark part is a shadow and how high the bump must be.
(19:34:50) [bylander] : TonyC, you have orders of magnitude better visual
(19:35:01) [bylander] : processing than what machines can do.
(19:35:47) [bylander] : We can only cover a little of what is known to work a little bit.
(19:36:31) [TonyC] : Okay. I guess I am thinking of just what you were talking about in class Monday, we zero in on the model that makes as much sense of the image as possible, and I understand why machines cannot do that.
(19:37:01) [bylander] : There is a lot of memory that we are able to use in NL
(19:37:13) [bylander] : or vision or learning that we don't know very well
(19:37:21) [bylander] : how to use yet in machines.
(19:38:00) [oharasteve] : It can still be useful, even if far from human-quality interpretation
(19:38:05) [bylander] : Back to the stapler square, a big difference is from the label on top.
(19:38:35) [bylander] : While the orientation difference produces less of an effect.
(19:39:21) [bylander] : This leads to edge detection, in which we hope
(19:39:37) [bylander] : discontinuities in light or color correspond to
(19:39:46) [vv] : i have a question, the stapler square looks different with the picture.
(19:39:53) [bylander] : discontinuities in shape and depth and objects.
(19:40:11) [bylander] : vv, different in what way?
(19:40:41) [TonyC] : I think it is just the one tiny square blown up, not the whole picture. I thought it was the whole picture at first, too.
(19:40:48) [vv] : both the desk and the wall and the stapler look like write
(19:41:08) [bylander] : Yes, see the little square in the upper right of page 6.
(19:41:21) [vv] : white i mean
(19:41:23) [oharasteve] : Human vision failure :)
(19:41:56) [bylander] : The 12x12 on page 7 is just a small part of page 6
(19:42:26) [vv] : ok, i see, i misunderstood
(19:42:43) [bylander] : Anyway, the main difference in the 12x12 is not due to shape,
(19:43:02) [bylander] : so edge detection is used to produce features
(19:43:19) [bylander] : that later processing steps will work on to add shape and depth.
(19:43:51) [bylander] : The steps of edge processing are illustrated on page 9.
(19:44:17) [bylander] : I took the fourth row of the 12x12: 167, 164, 171, ...
(19:44:44) [bylander] : and displayed that as the solid line on page 9.
(19:45:14) [bylander] : The smoothed line follows the raw line, but is less jagged.
(19:45:32) [bylander] : It accomplishes something noticeable even though
(19:45:41) [bylander] : it only averages two of the points.
(19:46:16) [bylander] : The diff line which seems to start at 200, then goes up to 230,
(19:46:44) [bylander] : back to 190, should be measured using the scale on the right of page 9
(19:47:01) [bylander] : The -10 to 30 scale on the right. You can see the peak
(19:47:25) [bylander] : of the diff line where the raw and smooth lines have a large slope.
(19:47:52) [bylander] : The peaks (high and low) are mapped to one part of an edge.
(19:48:11) [bylander] : Any confusion about the graph?
(19:48:48) [bylander] : This shows on one dimension what we want to do on 2 dimensions.
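(A minimal sketch of the one-dimensional processing on page 9, in Python. Only the first three values of row 4 appear in the chat; the rest of the row would come from the handout, so the list below is truncated on purpose.)

    # Smooth by averaging adjacent pairs, then difference the smoothed
    # values; peaks in the difference line mark candidate edge locations.
    row = [167, 164, 171]   # first values of row 4; remaining values from the handout
    smoothed = [(a + b) / 2 for a, b in zip(row, row[1:])]
    diff = [b - a for a, b in zip(smoothed, smoothed[1:])]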
(19:48:53) [TonyC] : No. Is there some theoretical reason for averaging 2 points? Could we use a binomial window weight, like 1,2,1 or 1,4,6,4,1?
(19:49:28) [bylander] : TonyC, if averaging two points hadn't made a big
(19:49:44) [bylander] : enough difference, I would have tried something fancier.
(19:50:02) [bylander] : Maybe I chose the 4th row because it looked good
(19:50:14) [TonyC] : Okay.
(19:50:28) [oharasteve] : There is an art to deciding how to pre-process images to extract good features
(19:50:30) [bylander] : Sometimes a large number of points are averaged (or blurred)
(19:50:54) [bylander] : in order to obtain large-scale features.
(19:51:16) [bylander] : And to reduce some noise at the cost of detail.
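(The heavier averaging mentioned above can be sketched with a binomial window such as the 1, 4, 6, 4, 1 weights TonyC suggested, normalized to sum to 1. This is an illustration of blurring for large-scale features, not the method used in the handout.)

    # Blur a 1-D signal with a normalized binomial window to suppress
    # noise and fine detail in favor of large-scale structure.
    def blur(signal, window=(1, 4, 6, 4, 1)):
        total = sum(window)
        half = len(window) // 2
        return [
            sum(w * signal[i + j - half] for j, w in enumerate(window)) / total
            for i in range(half, len(signal) - half)
        ]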
(19:51:50) [bylander] : I will go to the Sobel operators on page 11 before trying to explain page 10.
(19:52:15) [bylander] : The Sobel operators combine smoothing and differentiation in one step.
(19:52:35) [bylander] : The idea is that larger magnitudes correspond to edges.
(19:53:05) [bylander] : Let's take the vertical operator and apply it to the upper left corner of page 7
(19:53:46) [bylander] : The operator will multiply -1, -2, and -1 times the first three values in the first column.
(19:54:14) [bylander] : -1 * 195 + (-2)*210 + (-1)*164
(19:54:38) [bylander] : The 1, 2 and 1 are multiplied times the first three values of the
(19:54:46) [bylander] : third column.
(19:55:09) [bylander] : 1*221 + 2*249 + 1*180
(19:55:27) [bylander] : All of this added up comes to a positive number which
(19:55:51) [bylander] : indicates going from darker to lighter as you go from left to right.
(19:56:15) [bylander] : The horizontal operator is similar, but applied to the first three values of the first and third rows.
(19:56:39) [TonyC] : Do we assign that value to the center of the square, and do that for all squares?
(19:57:00) [oharasteve] : There are two values .... do we add them?
(19:57:03) [bylander] : 1*195 + 2*209 + 1*221 + (-1)*164 + (-2)*172 + (-1)*180
(19:57:27) [bylander] : Yes that value is assigned to the center of the square.
(19:57:48) [bylander] : With the two values, you apply the arctangent operator
(19:58:02) [TonyC] : Do we have operators to find 45 degree edges?
(19:58:10) [bylander] : and also determine the magnitude of the points:
(19:58:31) [bylander] : sqrt(v*v + h*h) where v and h are the two values
(19:58:55) [bylander] : TonyC, you can add more operators but this is sufficient for any angle
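(A sketch of the computation just walked through, assuming the upper-left 3x3 values read off in the chat. The center pixel's value is not given in the transcript, but it has weight 0 in both kernels, so the placeholder does not change the result.)

    import math

    patch = [[195, 209, 221],
             [210,   0, 249],    # center value not given in the chat; its weight is 0 anyway
             [164, 172, 180]]

    sobel_v = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]      # vertical operator
    sobel_h = [[1, 2, 1], [0, 0, 0], [-1, -2, -1]]      # horizontal operator

    def apply3x3(kernel, patch):
        return sum(kernel[i][j] * patch[i][j] for i in range(3) for j in range(3))

    v = apply3x3(sobel_v, patch)            # 899 - 779 = 120: lighter going left to right
    h = apply3x3(sobel_h, patch)            # 834 - 688 = 146
    magnitude = math.sqrt(v * v + h * h)    # about 189, assigned to the center pixel
    orientation = math.atan2(v, h)          # arctangent of the two values; the
                                            # argument order is a convention choice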
(19:59:31) [bylander] : Page 12 illustrates the direction and magnitude of the (h,v) values
(19:59:45) [bylander] : from the Sobel operators.
(20:00:02) [TonyC] : Oh. Hey, that's pretty cool!
(20:00:15) [bylander] : You can see longer lines where it goes from gray to white in the upper part
(20:00:20) [oharasteve] : Yes, that is slick!
(20:00:37) [oharasteve] : Edward Tufte would be proud
(20:00:43) [bylander] : You can see shorter but consistent lines on the lower half corresponding to the
(20:00:52) [TonyC] : Down below you can see the secondary edge from the corner of the stapler.
(20:00:52) [bylander] : orientation change.
(20:01:47) [TonyC] : Do we try to smooth these angles to make a clean edge?
(20:02:11) [bylander] : The single pixel edges need to be connected to obtain a line edge.
(20:02:35) [bylander] : I did not include the Figure 24.5b from the book,
(20:03:01) [bylander] : but that shows the edges for the whole stapler picture.
(20:03:03) [TonyC] : How do you determine a threshold of change? Is that a statistical choice based on the variation you detect, like one standard deviation or something?
(20:03:35) [bylander] : The book suggests a threshold, but it seems too naive to me.
(20:03:45) [oharasteve] : Think "art" not "science" ... it will vary by application
(20:04:00) [bylander] : A statistical method sounds like a good choice.
(20:04:22) [bylander] : I should mention vision is not my area, so don't count me as too much of an expert.
(20:04:32) [oharasteve] : Looking for a tank in the desert is different than an address on an envelope
(20:04:48) [bylander] : At this point, the edges are simply features to be used for
(20:05:13) [bylander] : later processing. We can get shape by matching features
(20:05:23) [bylander] : from different images.
(20:05:37) [TonyC] : I could see doing this at different scales, and using different thresholds based on what you expect from the coarser scale. Anyway, moving on...
(20:06:02) [bylander] : Now the Sobel operators are a simplified version of the general
(20:06:16) [bylander] : idea of convolution on page 10.
(20:06:56) [bylander] : G is a gaussian function. The function exp(-x*x) has values from x = -3 to x=3 of
(20:07:28) [bylander] : .0001, .0183, .3679, 1, .3679, .0183, .0001
(20:07:55) [bylander] : The Sobel operators simply map the .3679 to 1 and the 1 to 2 and ignore the rest.
(20:08:36) [bylander] : G' is the derivative. The function -2*x*exp(-x*x) has values from x=-3 to x=3 of
(20:09:15) [bylander] : .0007, .0733, .7358, 0, -.7358, -.0733, -.0007
(20:09:46) [bylander] : The Sobel operators map .7358, 0, -.7358 to 1,0,1 and ignore the rest.
(20:10:34) [oharasteve] : Sobel probably works in a lot of cases, seems like a reasonable approximation
(20:10:36) [TonyC] : Do you mean 1, 0, -1?
(20:11:00) [bylander] : That is, the Gaussian starts from 0, rises to 1 (positive derivative), has a max at 1 (zero derivative), and then decreases to 0 (negative derivative).
(20:11:13) [bylander] : TonyC, you are right it should be 1,0,-1.
(20:12:16) [TonyC] : Right, the binomial coefficients are an approximation of the Gaussian curve. That is what I was talking about with 1,4,6,4,1.
(20:12:23) [bylander] : The V(x,y) function with these simplified values would result in the vertical Sobel operator
(20:12:46) [bylander] : applied to point x,y on the image.
(20:14:07) [bylander] : In any case, the vertical Sobel operator's 9 values are obtained by:
(20:14:39) [bylander] : -1*1, 0*1, 1*1 on the top row
(20:15:00) [bylander] : -1*2, 0*2, 1*2 on the second row
(20:15:25) [bylander] : and -1*1, 0*1, 1*1 on the third row
(20:16:00) [bylander] : The derivative values change across columns,
(20:16:13) [bylander] : the Gaussian values change across rows.
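(A sketch of the connection just described: sampling the Gaussian and its derivative, rounding the central samples, and taking their outer product reproduces the vertical Sobel operator. The sign of the derivative weights is written to match the operator used earlier; the sampled G' values have the opposite sign because convolution flips the kernel.)

    import math

    xs = range(-3, 4)
    g  = [math.exp(-x * x) for x in xs]           # .0001, .0183, .3679, 1, .3679, .0183, .0001
    dg = [-2 * x * math.exp(-x * x) for x in xs]  # .0007, .0733, .7358, 0, -.7358, -.0733, -.0007

    smooth = [1, 2, 1]    # rounded from .3679, 1, .3679
    deriv  = [-1, 0, 1]   # rounded from the central G' samples (sign per the note above)

    # Derivative weights vary across columns, Gaussian weights down rows.
    sobel_v = [[s * d for d in deriv] for s in smooth]
    # -> [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]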
(20:17:00) [bylander] : Page 10 (taken from the book) suggests that E(x,y),
(20:17:14) [bylander] : whether there is an edge or not, is determined by thresholding
(20:17:22) [bylander] : the size of the convolution values.
(20:17:36) [bylander] : The orientation is done by the arctangent function.
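(A brief sketch of the page-10 decision, assuming v and h are the two convolution responses at a pixel; the threshold is a free parameter, as discussed above.)

    import math

    def edge(v, h, threshold=100.0):
        # E(x,y): call the pixel an edge if the response magnitude clears the
        # (ad hoc) threshold; the orientation comes from the arctangent.
        magnitude = math.sqrt(v * v + h * h)
        return magnitude > threshold, math.atan2(v, h)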
(20:17:55) [bylander] : Any confusion about page 10?
(20:18:23) [bylander] : If you understand the Sobel ops, that is good enough for me.
(20:18:23) [oharasteve] : Quite clever it seems
(20:18:56) [TonyC] : No. I guess we could use different functions for the magnitude, however.
(20:18:58) [bylander] : It seems like a good way to get edges, but for any
(20:19:14) [bylander] : complex scene (like the mess on my desk)
(20:19:23) [bylander] : there will be a large number of edges.
(20:19:59) [oharasteve] : Convolution is needed for face recognition, etc I think
(20:20:00) [bylander] : This has been more tiring than I thought it would be.
(20:20:10) [bylander] : I am not used to typing on demand.
(20:20:25) [TonyC] : Well, it might be useful to look for large scale edges at a coarse value, and then search the sub-spaces recursively...
(20:20:40) [oharasteve] : I am very appreciative of the effort
(20:20:45) [TonyC] : I meant coarse resolution...
(20:21:18) [bylander] : I will have the transcript with all the typing mistakes on my web site.
(20:21:21) [oharasteve] : Saves me many hours of driving
(20:21:29) [bylander] : There is a chance it might be useful.
(20:21:55) [oharasteve] : Thank you
(20:22:08) [Kwan] : Thanks
(20:22:10) [bylander] : Any feedback on whether this is a good or reasonable thing to do
(20:22:21) [bylander] : would be good to know.
(20:22:22) [oharasteve] : I hope to take the Machine Vision class in the spring
(20:22:35) [bylander] : I think I could add a picture element so I could show the
(20:22:44) [TonyC] : It seems like we could have covered it faster in person. We didn't get much done, do you think?
(20:22:52) [bylander] : slides, but it would be interesting to try to manage that
(20:23:01) [bylander] : at the same time I am trying to speed type.
(20:23:17) [oharasteve] : Paper is no problem at all
(20:23:39) [oharasteve] : More useful to have a whiteboard or live video
(20:23:49) [TonyC] : Agreed, paper is not a problem. Plus I get to eat dinner.
(20:23:57) [oharasteve] : Just showing slides on the screen only wastes screen space (for me)
(20:25:04) [bylander] : That is all from me. I'll be here to answer questions.
(20:25:12) [Kwan] : I think a webcast would be great. I've used a system which allows a presenter to put stuff up for everyone to see and listen to and then the rest of the group can chat.
(20:25:15) [TonyC] : I think even one-way speech from you to us would have let us do more, and saved you the typing.
(20:25:26) [oharasteve] : Happy Thanksgiving to everybody :)
(20:25:38) [vv] : yes, i agree tonyc
(20:26:11) [bylander] : I agree TonyC, but do you know of software which will work over diff OSes that we can use without costing too much?
(20:26:42) [vv] : MSN
(20:26:49) [TonyC] : We could still type our questions, I think. I don't know software, but several sites seem to manage it.
[bylander enters the chat - 20:26:52]
(20:28:02) [bylander] : I'll look into it, TonyC. Maybe some sort of streaming software might do the trick.
(20:28:20) [TonyC] : I mean a lot of live radio stations on the net do it, I don't know if they have to pay for it.
(20:28:33) [bylander] : Ack, I had to log back in, and it changed me to green.
(20:28:42) [Kwan] : I've used WebEx and that worked great.
(20:30:06) [TonyC] : I will ask around. A friend of mine might know of something.
(20:30:42) [TonyC] : All right, I'm signing off. See you all Monday.
[bylander enters the chat - 20:30:46]
(20:31:30) [ssiju] : me too. Happy Thanksgiving to everyone.