VISION-LIST Digest Volume 10 Issue 48
VISION-LIST Digest Tue Nov 12 12:54:34 PDT 91 Volume 10 : Issue 48
- Send submissions to Vision-List@ADS.COM
- Vision List Digest available via COMP.AI.VISION newsgroup
- If you don't have access to COMP.AI.VISION, request list
membership to Vision-List-Request@ADS.COM
- Access Vision List Archives via anonymous ftp to ADS.COM
Today's Topics:
The Brodatz saga continues....
Re: Fast Hadamard Transform
Re: corner detection
Re: corners
Code for shape descriptors?
2D FFTs and polar FFT
Handwritten character recognition (ref. request)
Fuzzy Logic evaluator/tutorial
Post-Doc position in CMU Vision / Parallel Processing Group
Applications sought for position
Kahaner Report: 1st Korea-Japan Conf.on Comp. Vision, Oct '91 Seoul
----------------------------------------------------------------------
Date: Fri, 8 Nov 91 7:39:05 EST
From: Stanley Dunn <smd@occlusal.rutgers.edu>
Subject: The Brodatz saga continues....
There have been a couple of messages recently noting that people
had been unable to connect to incisal.rutgers.edu; some in fact from
Rutgers! I discovered that periodically the internet daemon stops
running; this prevents people from logging in remotely.
My apologies to those of you who have had difficulty. If you cannot
connect, write to me (smd@occlusal.rutgers.edu) and I can check
incisal. I have a couple of students trying to use incisal for a graphics
project and when the code is being debugged, inetd dies. We should
be able to fix it soon.
Azriel always maintained I could never program......... :-)
Stan Dunn
smd@occlusal.rutgers.edu
------------------------------
Date: 5 Nov 91 05:59:37 GMT
From: decwrl!apple!well.sf.ca.us!well!tauzero@uunet.uu.net (Tau Zero)
Subject: Re: Fast Hadamard Transform
Vision-List@ADS.COM writes:
>Does anybody know of the existence of a fast hadamard transform,
>preferably in the form of source code? If not, would you know about a
>SLOW hadamard transform?
>The Hadamard transform is like the Fourier transform except that its
>basis function is a SQUARE wave, rather than a SINE wave.
See pages 115-120 of "Digital Image Processing", by R. Gonzalez and P. Wintz,
2nd edition - Addison-Wesley 1987 ISBN 0-201-11026-1.
Page 115 has the source code for a fast Walsh transform; reordering the results
gives the Hadamard transform.
There is an extensive literature on such square wave functions; Gonzalez
and Wintz provides a convenient entry point for the alert student!
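For readers without the book at hand, the butterfly idea behind a fast transform of this family can be sketched as follows. This is not the Gonzalez and Wintz listing; it is a minimal illustration, assuming an input length that is a power of two, that computes the unnormalized transform directly in natural (Hadamard) order. The function name `fwht` is made up for this example.

```python
def fwht(values):
    """Unnormalized fast Walsh-Hadamard transform, natural (Hadamard) order.

    Assumes len(values) is a power of two; uses O(n log n) additions.
    """
    a = list(values)
    n = len(a)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    h = 1
    while h < n:
        # Combine blocks of size h into blocks of size 2h with +/- butterflies.
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                x, y = a[i], a[i + h]
                a[i], a[i + h] = x + y, x - y
        h *= 2
    return a
```

Applying `fwht` twice returns the input scaled by its length, so dividing by `len(values)` after the second pass gives the inverse. A unit impulse such as `[1, 0, 0, 0]` transforms to `[1, 1, 1, 1]`.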
------------------------------
Date: 05 Nov 91 11:35:21+0200
From: Monika Sester <monika@ifp.bauingenieure.uni-stuttgart.dbp.de>
Subject: Re: corner detection
Foerstner designed a powerful operator that is able to detect and localize
precisely distinct points, corners and centres of circular features.
LITERATURE:
A Fast Operator for Detection and Precise Location of Distinct Points,
Corners and Centres of Circular Features
by W. Foerstner and E. Guelch
ISPRS (Intern. Society of Photogrammetry and Remote Sensing) Intercomm.
Workshop, Interlaken, June 1987
if you don't have access to this paper, i could send a copy.
Monika Sester, Stuttgart
monika@ifp.bauingenieure.uni-stuttgart.dbp.de
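As a rough illustration of the kind of measure such an operator relies on, the sketch below computes the two window statistics usually associated with the Foerstner operator in later descriptions: an interest weight w = det(N)/trace(N) and a roundness q = 4 det(N)/trace(N)^2, where N is the sum of gradient outer products over a window. It is a simplified sketch, not the procedure from the paper: the central-difference gradients, the unweighted square window, and the name `foerstner_measures` are assumptions, and the sub-pixel localization step is omitted.

```python
def foerstner_measures(image, row, col, radius=2):
    """Return (w, q) for the square window centred at (row, col).

    image is a list of equal-length rows of numbers. The window plus a
    one-pixel border for the gradients must lie inside the image.
    """
    height, width = len(image), len(image[0])
    if not (radius + 1 <= row < height - radius - 1
            and radius + 1 <= col < width - radius - 1):
        raise ValueError("window does not fit inside the image")
    n_xx = n_yy = n_xy = 0.0
    for r in range(row - radius, row + radius + 1):
        for c in range(col - radius, col + radius + 1):
            # Central-difference gradient estimates (an assumption).
            gx = (image[r][c + 1] - image[r][c - 1]) / 2.0
            gy = (image[r + 1][c] - image[r - 1][c]) / 2.0
            n_xx += gx * gx
            n_yy += gy * gy
            n_xy += gx * gy
    trace = n_xx + n_yy
    det = n_xx * n_yy - n_xy * n_xy
    if trace == 0:
        return 0.0, 0.0  # flat window: no gradient information
    return det / trace, 4.0 * det / (trace * trace)
```

On a straight edge the gradients all point the same way, so det(N) and hence w vanish. At a corner both gradient directions are present, so w is positive and q approaches 1.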
------------------------------
Date: Tue, 05 Nov 91 23:18:03 +0000
From: Keith Langley <ucjtskl@ucl.ac.uk>
Subject: Re: corners
There are additional techniques for corner detection
based upon Gabor filters (needs improving but works reasonably well)
published in the BMVC conference in Glasgow U.K 1991.
------------------------------
From: Raul Valdes-Perez <valdes@CARMEN.KBS.CS.CMU.EDU>
Date: Wed, 6 Nov 91 10:46:43 EST
Subject: code for shape descriptors?
Greetings,
I am beginning a new project in cell biology for which the starting
point is to compute a number of measures of cell properties. I have
in mind using at first some generic shape descriptors from vision.
Ballard & Brown discuss simple 2-D shape descriptors in their book,
e.g., area, eccentricity, compactness, lobedness, shape numbers, etc.
Can anyone point me to code that computes these or other 2-D shape
descriptors?
Raul E. Valdes-Perez valdes@cs.cmu.edu
School of Computer Science and
Center for Light Microscope Imaging and Biotechnology
Carnegie Mellon University
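A few of the descriptors named above are simple enough to sketch directly from a binary mask. The code below is only an illustration under stated assumptions: the region is a list of rows with truthy object pixels, perimeter is counted as the number of pixel edges facing background, compactness is taken as perimeter squared over area, and eccentricity is derived from the second central moments (one of several definitions in use). The function name `shape_descriptors` is made up for this example.

```python
import math


def shape_descriptors(mask):
    """Area, perimeter, compactness and eccentricity of a binary region."""
    pixels = [(r, c) for r, row in enumerate(mask)
              for c, v in enumerate(row) if v]
    area = len(pixels)
    if area == 0:
        raise ValueError("mask contains no object pixels")
    filled = set(pixels)
    # Perimeter: pixel edges whose 4-neighbour is background.
    perimeter = sum(
        (r + dr, c + dc) not in filled
        for r, c in pixels
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1))
    )
    # Centroid and second central moments.
    cr = sum(r for r, _ in pixels) / area
    cc = sum(c for _, c in pixels) / area
    mu_rr = sum((r - cr) ** 2 for r, _ in pixels) / area
    mu_cc = sum((c - cc) ** 2 for _, c in pixels) / area
    mu_rc = sum((r - cr) * (c - cc) for r, c in pixels) / area
    # Eigenvalues of the 2x2 moment matrix give the principal spreads.
    half_trace = (mu_rr + mu_cc) / 2
    root = math.sqrt(((mu_rr - mu_cc) / 2) ** 2 + mu_rc ** 2)
    lam_max, lam_min = half_trace + root, half_trace - root
    eccentricity = math.sqrt(1 - lam_min / lam_max) if lam_max > 0 else 0.0
    return {
        "area": area,
        "perimeter": perimeter,
        "compactness": perimeter ** 2 / area,
        "eccentricity": eccentricity,
    }
```

With these definitions a 2x2 square has area 4, perimeter 8, compactness 16 and eccentricity 0, while a 1x4 line has eccentricity 1.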
------------------------------
Date: 6 Nov 91 22:35 -0800
From: Esfandiar Bandari <bandari@cs.ubc.ca>
Subject: 2D FFTs and polar FFT
I am looking for very fast C routines that conduct FFTs (and
inverse FFTs) on two dimensional float and complex (float) valued
images. Presently I am using the N-dimensional routines that come
with Numerical Recipes in C.
I am also very much interested in routines that take an image
and produce its polar fourier transform.
Lastly, since I sometimes use these routines iteratively,
it would be nice if a version of these routines would initialize a
trig look-up table and reuse it in subsequent calls. Any routines,
pointers or information leading to the source code would be greatly
appreciated.
--- Esfandiar
[Blindingly fast FFT code would be a good addition to the Vision Archive.
phil... ]
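The look-up-table idea in the request can be sketched as below. This is an illustration of the structure only, written in Python rather than the C the poster asked for, and it makes no claim to speed. It assumes power-of-two sizes; the class name `Radix2FFT` and the helper `fft2` are made up for this example. The trig table and the bit-reversal order are built once per size and reused on every call.

```python
import cmath


class Radix2FFT:
    """Radix-2 FFT for one fixed power-of-two length, with cached tables."""

    def __init__(self, n):
        if n < 1 or n & (n - 1):
            raise ValueError("n must be a power of two")
        self.n = n
        # Trig look-up table: exp(-2*pi*i*k/n) for k = 0 .. n/2 - 1.
        self.twiddles = [cmath.exp(-2j * cmath.pi * k / n)
                         for k in range(n // 2)]
        # Bit-reversal permutation, also computed only once.
        bits = n.bit_length() - 1
        self.order = [int(format(i, "b").zfill(bits)[::-1], 2) if bits else 0
                      for i in range(n)]

    def forward(self, x):
        n = self.n
        if len(x) != n:
            raise ValueError("input length must equal the table size")
        a = [complex(x[j]) for j in self.order]
        size = 2
        while size <= n:
            half, step = size // 2, n // size
            for start in range(0, n, size):
                for k in range(half):
                    t = self.twiddles[k * step] * a[start + k + half]
                    u = a[start + k]
                    a[start + k], a[start + k + half] = u + t, u - t
            size *= 2
        return a

    def inverse(self, spectrum):
        # Inverse via conjugation, reusing the same tables.
        y = self.forward([complex(v).conjugate() for v in spectrum])
        return [v.conjugate() / self.n for v in y]


def fft2(image, row_fft, col_fft):
    """2-D transform: 1-D FFT of every row, then of every column."""
    rows = [row_fft.forward(r) for r in image]
    cols = [col_fft.forward(list(c)) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]
```

For an iterative algorithm one would build `Radix2FFT(width)` and `Radix2FFT(height)` once and pass them to `fft2` on each iteration, so no trigonometric function is evaluated after set-up.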
------------------------------
Date: Fri, 8 Nov 1991 05:52:46 GMT
From: laviers@iro.umontreal.ca (Sebastien Lavier)
Organization: Universite de Montreal
Subject: Handwritten character recognition (ref. request)
Hi,
I'm deeply sorry to occupy this bandwidth but...
I'm looking for references of articles/books on
handwriting recognition (alphabet)
I'm new on the subject (It's for a graduate 'AI' course)
so.. nothing's too simple for me...
TRY TO INSULT MY INTELLIGENCE! 8-)
I'm aware of the 2 articles written by J.R. Ward and B. Blesser
in 1985 (published in IEEE, Proc. Trends and Applications and in
IEEE C,G&A)
I also have a copy of a Ph D thesis from Marc Berthod
from 1982 (Univ. de Paris VI) "Une methode syntaxique
de reconnaissance des caracteres manuscrits en temps reel
avec un apprentissage continu"
It gives a syntactic method for handwritten character recognition.
Is there (probably is...) anything interesting SINCE 1985-86?
I'm wondering about other methods (non-syntactic?)
I'm just wondering if someone has something really great
that I can't let myself overlook.
Thank you very much
|Sebastien Lavier [Calimero] | Quote for the day:
|laviers@iro.umontreal.ca | "The programmer's national
|Universite de Montreal, Canada | anthem is 'AAAAAAAAAHHHH!'."
|Centre de recherche, Hopital du Sacre-Coeur | -Weinberg, p.152
|Mot du jour: "C'est vraiment pas juste." |
------------------------------
Date: Sun, 10 Nov 1991 11:35:34 GMT
From: ron@monu6.cc.monash.edu.au (Ron Van Schyndel)
Organization: Caulfield Campus, Monash University, Melb., Australia.
Subject: Fuzzy Logic evaluator/tutorial
Keywords: fuzzy logic, logic, prolog
Is there a PD fuzzy logic simulator or tutorial about?
I am particularly interested in programs that demonstrate the concepts in
a tutorial fashion. The specific purpose of this is to show various
means of image classification, and how LISP/PROLOG or an expert system
may be able to infer simple classes from a binary or gray-level image.
Thanks in advance, RON
Ron van Schyndel, Physics Dept. ron@monu6.cc.monash.edu.au
Monash University (Caulfield Campus) ron%monu6.cc.monash.edu.au@uunet.UU.NET
CAULFIELD EAST, Victoria, AUSTRALIA {hplabs,mcvax,uunet,ukc}!munnari!monu6..
------------------------------
From: webb+@CS.CMU.EDU (Jon Webb)
Subject: Post-Doc position in CMU Vision / Parallel Processing Group
Nntp-Posting-Host: duck.warp.cs.cmu.edu
Lines: 16
Project Scientist (post-doc) sought for research in parallel processing
and computer vision. PhD required. The research involves the
development of architecture-independent programming systems for parallel
image processing and computer vision. The successful candidate will work
as a member of the Vision and Autonomous Systems group in the School of
Computer Science and the Robotics Institute at Carnegie Mellon
University, one of the world's largest and most distinguished vision
research centers. Two-year appointment beginning as soon as convenient
for an outstanding candidate. Send curriculum vitae and references to
Jon A. Webb, School of Computer Science, Carnegie Mellon University,
5000 Forbes Avenue, Pittsburgh, PA 15213-3890, or use electronic mail
address: Jon_Webb@cs.cmu.edu.
Carnegie Mellon is an equal employment opportunity/affirmative action employer.
J
------------------------------
Date: Tue, 5 Nov 91 00:17:08 GMT
From: martin@eola.cs.ucf.edu (Glenn Martin)
Subject: Applications sought for position
UNIVERSITY OF CENTRAL FLORIDA
Orlando, Florida
Computer Science Department
The University of Central Florida seeks applications for two tenure
track positions in Computer Science. Both of these will be at the
level of Assistant Professor. We are interested in all strong
candidates who have demonstrated research strength in computer vision and
artificial intelligence.
Within the area of Computer Vision, we are
particularly interested in those whose research includes either
computer graphics, or medical imaging.
Within the area of artificial intelligence, we are interested
in those whose research includes natural language understanding,
knowledge representation, and knowledge acquisition.
We are a young, dynamic university with a student population that is
about 22,000. The Computer Science Department is one of the
largest on campus, offering the Bachelor's, Master's and Ph.D. degrees.
The faculty research interests include parallel computation,
VLSI, artificial intelligence, computer vision, networking technology,
graphics and simulation, and design and analysis of algorithms.
Currently, the department has small but active research groups in both
computer vision and artificial intelligence.
Candidates will be expected to strengthen these research groups.
The university is located in Orlando, the center of Florida's strong
software development industry. Its campus is adjacent to the Central
Florida Research Park which houses the Naval Training Systems Center,
the Army's Project Manager for Training Devices, and several
University research organizations including the institute for
Simulation and Training, and the Center for Research in Electro-Optics
and Lasers. Computer Science faculty work closely with, and receive
substantial research support from these groups and from the NASA
Kennedy Space Center which is located within 50 miles of the campus.
Central Florida affords an excellent standard of living.
Orlando ranks among the ten most livable cities in the USA and has
a variety of attractions and restaurants.
We have a
strong public school system, easy access to the beaches and a climate
that makes it possible to enjoy the outdoors all year long.
Applications are invited through February 15, 1992. Interested,
qualified applicants should send resumes and names of at least three
references to: Dr. Terry J. Frederick, Chair, Department of Computer
Science, University of Central Florida, Orlando, FL 32816-0362. TEL:
(407) 823-2341, FAX: (407) 823-5419, Email: fred@cs.ucf.edu.
An Equal Employment Opportunity (M/F) Affirmative Action Employer.
------------------------------
Date: 11 Nov 91 03:15:42 GMT
From: rick@cs.arizona.edu (Rick Schlichting)
Subject: Kahaner Report: 1st Korea-Japan Conf.on Comp. Vision, Oct '91 Seoul
[Dr. David Kahaner is a numerical analyst on sabbatical to the
Office of Naval Research-Asia (ONR Asia) in Tokyo from NIST. The
following is the professional opinion of David Kahaner and in no
way has the blessing of the US Government or any agency of it. All
information is dated and of limited life time. This disclaimer should
be noted on ANY attribution.]
[Copies of previous reports written by Kahaner can be obtained from
host cs.arizona.edu using anonymous FTP.]
From: David K. Kahaner, ONR Asia [kahaner@xroads.cc.u-tokyo.ac.jp]
Re: First Korea-Japan Conference on Computer Vision, 10-11 Oct '91 Seoul
8 Nov 1991
This file is named "kj-cv.91"
ABSTRACT. A summary and assessment of the First Korea-Japan Conference on
Computer Vision, held 10-11 Oct 1991 in Seoul Korea is presented.
INTRODUCTION AND SUMMARY
The "First Korea-Japan Conference on Computer Vision" was held 10-11 Oct
1991 in Seoul's KOEX Conference Center. Approximately 80 papers were
presented (31 from Korea, 47 from Japan) in three parallel sessions
during the two days. (KOEX, a massive convention facility, could easily
have swallowed 50 such conferences.) Two invited papers summarized
computer vision research in Korea and Japan respectively, and a panel
discussed "Application of Computer Vision for Automation". There were
about 200 attendees. As few Japanese or Koreans know each other's
language, the conference language was English. Korean scientists are
very far behind in research, development and industrial applications of
computer vision. Both countries see analysis of moving images as the
most important new research area, although many "standard" topics are
far from solved. (The recent interest in movement stems from increasing
processing speed. Without fast computers and fast data movement
researchers had to sample images every second or two, leading to very
complicated relations between successive frames.)
Conference chairmanship was shared between
Professor Chung-Nim Lee
Basic Science Research Center
Pohang Institute of Science and Technology
PO Box 125, Pohang, Kyung Buk, 790-330 Korea
Tel: +82-42-562-79-2041, Fax: +82-42-562-79-2799
Email: CNLEE@VISION.POSTECH.AC.KR
and
Professor Masahiko Yachida
Faculty of Engineering Science
Dept of Information and Computer Sciences
Osaka University
1-1 Machikaneyama-cho
Toyonaka-shi, Osaka 560
Tel: +81-6-844-1151 ext 4846, Fax: +81-6-845-1014
Email: YACHIDA@ICS.OSAKA-U.AC.JP
I had met Professor Lee almost twenty years earlier when we were both in
the Mathematics Department at Ann Arbor. He was trained as a pure
mathematician who is now interested in applied problems, particularly
those concerned with computer vision.
The fairest things that can be said about this conference are that (1)
the Korean scientists were very brave to have organized it, especially to
have scheduled more or less alternating talks describing Japanese and
Korean research, and (2) the Japanese scientists were very gracious to
have participated so enthusiastically. A preview of what was going to
happen was given at the opening lectures when research in the two
countries was reviewed. A literature search on computer vision uncovered
fewer than one tenth as many Korean as Japanese research papers.
Although the paper balance at the conference was much more even, it was
very clear that Korean research is at a much earlier stage, and that
applications of vision in industry are much more limited than in Japan.
There were a (very) few exceptions. For example, Prof T.Poston in
POSTECH's mathematics department [Email: TIM@VISION.POSTECH.AC.KR] gave
an elegant discussion of the use of singularity theory for vision
applications, but at the moment this work is very far from practical
realization. Also, some Korean industry is using vision techniques and
in particular Samsung has a visual soldering inspection system that has
many similarities to systems developed by Toshiba and NTT.
Computer vision is usually thought of as beginning with a real world
situation (scene) which is input through a camera and then digitized, and
ending with a description of the scene, e.g., knowledge (perhaps leading
to action which changes the scene, etc.). The in-between steps are often
modularized. Typically there is a "camera model" relating to color,
range, separation of cameras for stereo images, and other parameters,
which is used at the input phase. Next, properties of the image are
invoked to locate image features such as edges, lines, or regions. This
phase is usually called feature extraction. Lastly, at the highest level,
there is some underlying object model, for example the designer knows
that the scene is supposed to be of an automobile, and then matching is
done to locate these objects in the scene. This involves solving problems
such as direction, angle, and occlusion. The result is scene description
or scene knowledge. Research in computer vision is often
compartmentalized into subtopics that follow this modularization as well.
For example, "image processing" usually refers mostly to the lowest
levels, whereas pattern matching research almost always refers to the
highest level.
In computer vision research it is not too difficult to get to the
leading edge of what has been accomplished, and thus almost any project
will quickly need to address advanced problems. But simply put, because
the Japanese have tried so many different approaches, their breadth of
research experience is very much greater than the Koreans'. They are also
trying deeper and more sophisticated techniques, although the disparity
might not be too great in a few specific promising areas such as scene
identification.
JAPAN AND KOREAN COMPUTER VISION SUMMARIZED
From the Japanese side
Prof Yoshiaki Shirai
Dept of Mach Eng for Computer Controlled Machinery
Osaka Univ
Suita, Osaka 565 Japan
Tel: +81-6-877-5111 ext 4706, Fax: +81-6-876-4975
Email: SHIRAI@CCM.OSAKA-U.AC.JP
presented a clear summary of past work in Japan. Shirai pointed out that
Japan has a Computer Vision Group with about 500 Japanese members. They
meet bimonthly and had their first symposium this summer (this is in
addition to any international meetings that have been held). The Group's
chair is Prof Yachida, mentioned above. There is a SIG in Pattern
Recognition and Understanding (until recently Pattern Recognition and
Learning) sponsored by the IEICE (Institute of Electronics, Information
and Communication Engineers) which publishes about 125 papers yearly in
ten issues. This group also includes a small amount of speech
recognition. There is a SIG CV sponsored by the IPSJ (Information
Processing Society of Japan) focusing on image processing that publishes
about 60 papers each year in a bimonthly journal. Finally there is also a
SIG Pattern Measurement sponsored by SICE (Society on Instrumentation and
Control Engineers), publishing about 20 papers yearly in four issues, but
this is heavily oriented toward very practical hardware problems.
A survey of the database of information processing literature in Japan
(this covers the period 1986-1988, the latest data that is available)
characterizes computer vision related papers as follows (excluding coding
of images).
                                   Number of Papers
   Applications                          477
      Drawings                            96
      Medical                             85
      Characters                          81
      Industrial                          75
      Scientific                          64
      Remote sensing                      40
      Intelligent robot                   36
   3-D input and recognition             132
   Feature extraction                    105
   Hardware systems                       89
   Image understanding                    82
   Stereo (or multidimensional)           62
   Time sequence images                   46
   Image database                         45
Shirai pointed out that in a few areas, such as industrial applications,
there is far more work than is represented by the number of published
papers.
The only more recent data is from the IPSJ's SIG CV for 1990-1991:
                                   Number of Papers
   Time sequence images                   18
   Feature extraction                     12
   3-D input and modeling                 10
   Stereo                                  9
   Medical                                 8
   Matching                                6
   Neural network for matching             6
   Shape from X                            5
   Face                                    4
It is clear that the most important new area is analysis of sequences of
images, and this view was also shared by the Korean attendees. While
there are only four papers concerning computer vision in the field of
human faces, this is also seen to be a growing area, incorporating human
computer interface, remote teleconferencing, human emotional information
processing, and image coding.
Shirai went on to describe several specific Japanese projects that
involve computer vision. The most elaborate of these is the Y20B ($140M
US) 1983-1991 "Robot for Extreme Work" project, in which the ultimate
application is the development of an autonomous teleoperated robot for
nuclear plant, pipe cleaning, underwater, and emergency (such as fire)
operation. This particular project involves much more than just computer
vision, and in fact research has been done on fundamental problems of
locomotion, tele-existence, manipulation, and sensing, as well as the
development of a system integration language. The part of the project
dealing with these fundamental issues actually received the bulk of the
funding, and more applied aspects, i.e., to really develop such a robot
were not so well funded. In addition to Japanese universities, ETL,
Fujitsu, Toshiba, and other companies participated--Toshiba working on
feature extraction, and Fujitsu on projecting images onto a sphere (which
Shirai claimed works well in clean environments). ETL has done a great
deal of work on sensing, stereo, and robot vision language development
and actually issued a special issue of the ETL Bulletin on Robotics in
which this has been summarized. He showed several photos of the
prototypes that had been developed. One of these looked like a monster
from Star Wars II, and Shirai admitted that 8 years was a long time for
this technology and that a newer project would have designed a less
clumsy looking robot.
Another interesting Japanese project is a vision-based vehicle system.
This shares some of the same goals as similar projects in the US, such as
at CMU. The Japanese project (which is also supported by MITI) is in two
phases. The initial or phase-0 part was mostly done by Fujitsu and Nissan
around 1989, and involved a vehicle on a special test course, shadowless
illumination and only large obstacles. The vehicle (a motor home) has
three cameras for lane detection and two more for obstacle avoidance, and
a sonar system. Techniques used are line following for lane finding, and
sonar for obstacles and for range finding. Phase-1 which runs from 1989
to 1995 involves learning how to run the vehicle on a highway with a
centerline by distinguishing line and road boundaries, and also road
signs. Phase-2, from 1995 to 2000 will deal with multi-lane highways,
tunnels, rain, windshield wipers and using stereo for obstacle avoidance.
Phase-3, from 2000 to 2030 will (hopefully) deal with normal roads,
crossings, parking, backing up, using a mirror and involve tools of scene
understanding and map matching. This project also has a unique
perspective on wanting to use active sensing, for example to help the
scene understanding by using sound, and to understand the sounds being
received by use of the input visual data. Thus the project designers are
thinking about sensor fusion and multi-sensor integration. These parts
of the program will begin soon at Tokyo University. Shirai admitted that
at the moment image segmentation was one of their most difficult
problems, but he did show us some film of the motor home on its test road
and it seemed to be working, although rather slowly. This appeared to be
at a much less advanced state than the CMU project I saw more than a year
ago.
From the Korean side
Prof Chan Mo Park
Professor and Chairman, Dept of Computer Science & Eng.
Pohang Institute of Science and Technology
P.O. Box 125, Pohang, 790-600 Korea
Tel: +82-562-79-2251, Fax: +82-562-79-2299
Email: PARKCM@VISION.POSTECH.AC.KR
gave a summary of computer vision activities in Korea. Until very
recently there was not much to report, and even today he emphasized that
industrial applications are very limited. Most research is occurring at
Universities and government research institutes using facilities imported
from other countries. Several Korean companies do market low-price
machine vision systems developed in Korea, but, to date, their
performance has not been impressive. Production line utilization of
computer vision is infrequent, and limited to simple inspection and very
repetitive tasks. Park claimed that Korean companies would rather not
purchase a general purpose vision system such as a Sammi-AB, but prefer
to obtain very task-specific systems. Industry does see a very strong
need for efficient algorithms for segmentation, classification, and of
course for high reliability.
Before 1989 work was very scattered and mostly restricted to workshops
and courses in medical imaging, CAD/CAM, image processing and computer
graphics. Modern work really begins only in 1989 with an Image Processing
and Image Understanding Workshop (at Postech) at which time it was
decided to have annual workshops in order to share research activities.
Subsequently, two workshops have been held with a total of 42 papers
presented. Two related meetings are worth mentioning, an International
Workshop on Medical Imaging (March 91 at Korea Institute of Science and
Technology), and a Chapter Meeting of the Korea Information Society (May
1991 at Chung-Joo Univ) which had as its theme Current Status of Pattern
Recognition Technology, and generated half a dozen overview papers. There
are now three SIGs interested in vision, SIG AI (Korea Information
Science Society), SIG IP-TV (Korean Institute of Telematics and
Electronics), and SIG Images (Korean Institute of Communication Science).
Park also gave a list of research activities at various Korean research
centers but did not go into detail about the projects. I have attached
to this report a summary of his list because it gives a realistic sense
of the work going on in Korea. Because the data was collected by asking
scientists, the amount of thought and detail provided varies greatly (how
many PCs does a Cray-2 equate to?). But by scanning this, it is very
clear that there are only a very few places with substantial equipment
resources with respect to vision. I will try to obtain more details about
the actual progress of the research at those institutes. Park did
show AVIS, a project at Postech, which is an automated inspection system
for use in the Pohang steelmaking factory using the PIPE computer
developed at KIST.
For the future Park felt that vision work should concentrate on factory
automation, that biomedical applications were still a promising field
that could have broader applications, and that handwritten character
recognition was the key to office automation applications. In the area of
more fundamental research he felt that Korean scientists should work on
moving target detection, remote sensing, mobile robots and other motion
related problems, and that the Korean government needed to take a more
active role with additional funding, manpower development, and mechanisms
to encourage cooperation between industry and university, as well as
international cooperation.
PANEL DISCUSSION: APPLICATION OF COMPUTER VISION FOR AUTOMATION
This was the most fascinating part of the meeting, as it placed six
experts together and gave each an opportunity to describe work that they
had seen and work that they were hoping would be done in the future.
Panelists were
Dr. Sung Kwun Kim
Executive Director
Robotics & Automation R&D Division
Samsung Electronics
259 Gong Dan-Dong, Gumi, Kyung Buk, Korea
Tel: +82-546-460-2015, Fax: +82-546-461-8038
Prof Jong Soo Choi
Dept of Electronic Engineering
Chung-ang Univ
221 HukSeok Dong, DongJak Gu, Seoul
Tel: +82-2-815-9231~7 Ext 2235, Fax: +82-2-815-9938
Prof Kwang Ik Kim
Dept of Mathematics
Pohang Institute of Science and Technology
PO Box 125, Pohang, Kyung Buk, 790-330 Korea
Tel: +82-42-562-79-2044, Fax: +82-42-562-79-2799
Email: KIMKI@VISION.POSTECH.AC.KR
Dr. Takeshi Shakunaga
NTT Human Interface Laboratories
NTT
3-9-11, Midori-cho
Musashino-shi, Tokyo 180
Tel: +81-422-59-3336, Fax: +81-422-59-2245
Email: SHAKU@NTTARM.NTT.JP
Dr. Johji Tajima
Research Manager, Pattern Recognition Research Lab
NEC C&C Information Technology Research Labs
1-1 Miyazaki 4-chome, Miyamae-ku, Kanagawa 216 Japan
Tel +81-44-856-2145, Fax: +81-44-856-2236
Email: TAJIMA@PAT.CL.NEC.CO.JP
Dr. Yoshinori Kuno
Senior Research Scientist
Toshiba Corporation
Information Systems Lab
Research and Development Center
1, Komukai Toshiba-cho, Saiwai-ku, Kawasaki, 210 Japan
Tel: +81-44-549-2241, Fax: +81-44-549-2263
Email: KUNO@ISL.RDC.TOSHIBA.CO.JP
The panel was chaired by
Dr. Masakazu Ejiri
Senior Chief Scientist, Corporate Technology
Hitachi
Central Research Laboratory
Kokubunji, Tokyo 185 Japan
Tel: +81-423-23-1111, Fax: +81-423-27-7700
Unfortunately, none of the panelists provided hand-outs and so my summary
below is based on notes that may not be completely accurate.
Ejiri (Hitachi) only made a very few remarks, but pointed out that vision
systems were realized in Japan 20 years ago. (See my comments earlier
about depth and breadth of research vis-a-vis Japan and Korea.) He also
pointed out that there was very tough competition between Japanese
companies, but very friendly discussions between researchers. (Isn't this
the Japanese way; maybe this is the reason that everybody's soldering
inspection systems look alike.)
Kuno (Toshiba): Claimed that more than 100 computer vision applications
were developed at Toshiba. Not all were successful and most were
developed for position detection. The ones that work have a common thread
that they begin with a good (high contrast) image input. He mentioned
three specific examples of vision systems now in use within Toshiba but
did not go into any real detail about any of the specific hardware or
software techniques that were used.
* Soldering inspection system for the mounting of ICs onto PCBs.
In some sense this is a very simple problem, as there is a
clean model of what the image is supposed to look like. The
hard part of this problem is to get good input images.
Toshiba's inspection station uses 168 LEDs to illuminate
different parts of the target.
* Agricultural automation. This involves using a robot to cut
young plants at the proper stem length.
* Digital Audio Tape, and VCR, magnetic head gap-width adjusting
system using computer processing of images of Moire patterns.
Kuno commented succinctly about the state of the art, that "we are using
'70s algorithms on '80s hardware". As for the future he felt that there
would be no general purpose vision system in the near future because of
cost issues. In his view there are three basic ways to use computer
vision systems.
* Use simple (e.g., low cost) vision system cleverly for factory
automation, human computer interface, etc.
* Apply heuristic tuning to fields with strong needs, e.g.
character scanning/recognition is a perfect example.
* Do basic research on sophisticated vision systems for future
applications, such as robots, nuclear power plants, space work, etc.
Presumably Toshiba's research support will follow these paths.
Tajima (NEC): He felt that for image processing (as opposed to image
understanding) there were already very cheap general purpose systems with
many operators built into hardware for preprocessing (such as
thresholding, etc.). He then went on to give a rapid description of a
collection of vision applications within NEC, again with few details.
* Multi-layer substrate inspection station to detect short
circuits, pin holes, etc., for use with the boards NEC uses on
their supercomputers (SX series). This system can inspect
225mm^2 board area in 25 minutes.
* Soldering inspection station, looking a great deal like
Toshiba's, with five cameras and lights for 3D views.
* Deformation measurement by laser range finding for circuit
boards.
* Inspection system for determining if foreign objects are inside
of empty (Coke) bottles, and another system for determining the
amount of liquid in a bottle.
* A 3-D human body measurement system. This was the most
intriguing of the lot. The application here is to determine the
tailoring of apparel, by measuring cross sections of humans.
Subject is in a box and is illuminated by six lasers. The
software uses a body model which runs on a workstation, and a
database that runs on a minicomputer.
As far as industry was concerned Tajima felt that the important work
needs to be done in 3D recognition as well as motion detection, and that
recognition of features needs to be above 99% to be industrially useful.
He felt a standardized database of images would be very helpful for
studying algorithms. As far as new directions, he mentioned the
importance of sensor fusing to enhance the reliability of existing
techniques.
Shakunga (NTT): Claimed that NTT was trying to combine visual information
processing with other technologies to develop applications in the area of
advanced visual telecom services and network support, both of obvious
importance to NTT. He gave two examples.
* Maintenance. A manhole facility inspection system using a truck-
mounted underground-looking radar that eliminated the need for
digging to locate pipes. This uses pattern recognition and
frequency domain analysis. This is said to work to 1.5m depth,
which includes 75% of the company's pipes. (If you have ever lived
in "dig we must" NY you will know how welcome such a system
would be.) A second system uses a TV camera on a stalk that
looks inside manholes and generates a stereo view of the
facilities layout inside (using vertical camera movement for
the second image) and then generates a drawing of the manhole
contents. This uses edge detection which is said to be accurate
to 0.05 pixels.
* Human computer interface. The idea is to transmit less feature
data for teleconferencing. NTT has been experimenting with
human head, lip, and body image readers. The idea is to
interpret head motion and generate understanding based on head
movement. This uses edge detection of the head silhouette and
analysis of the facial area.
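The report gives no detail on how NTT reaches 0.05-pixel edge accuracy.
As a rough illustration only, one common way to locate an edge between
pixel centers is to fit a parabola through the gradient peak and its two
neighbors. The Python sketch below assumes a 1-D intensity profile taken
across the edge; the function names are hypothetical and are not taken
from NTT's system.

```python
def subpixel_peak(values):
    """Return the fractional index of the largest interior sample.

    A parabola is fitted through the peak sample and its two neighbors;
    the vertex of that parabola is the sub-pixel estimate.
    """
    if len(values) < 3:
        raise ValueError("need at least three samples")
    # Index of the largest interior sample (first one wins on ties).
    i = max(range(1, len(values) - 1), key=lambda k: values[k])
    left, mid, right = values[i - 1], values[i], values[i + 1]
    denom = left - 2.0 * mid + right
    if denom == 0:
        return float(i)  # flat neighborhood: no refinement possible
    return i + 0.5 * (left - right) / denom


def edge_position(profile):
    """Estimate where a 1-D intensity profile changes fastest."""
    if len(profile) < 5:
        raise ValueError("need at least five samples")
    # Central-difference gradient magnitude; grad[j] belongs to profile[j + 1].
    grad = [abs(profile[k + 1] - profile[k - 1]) / 2.0
            for k in range(1, len(profile) - 1)]
    return 1 + subpixel_peak(grad)
```

For an ideal step such as [0, 0, 0, 1, 1, 1] the estimate falls midway
between the last dark and the first bright sample. Real images add noise
and blur, so the accuracy reached in practice depends on the optics and
on any smoothing applied beforehand.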
Shakunga divided future research themes in three directions.
* Early vision. Because human systems are very adaptable, we
should study adaptive tuning of input data, and attention
getting mechanisms. Study human implementation of early vision
algorithms for edge, region, texture, shape, distance, motion, etc.
* Middle level vision. Requires research into model based
matching, from specific (recognition) to generic (cognition).
* High level vision. Study 3D world description and manipulation.
Consider integration of vision and semantic databases.
Kim (Samsung): Felt their problems were similar to NTT's and to
Toshiba's. Also felt that the cost of vision systems will be coming down
quickly, although this is now still a bottleneck. He gave a short list of
computer vision applications but with even fewer details than the other
industrially based speakers.
* System for mounting a screw in a microwave oven cavity.
* Simple assembly.
* Soldering system for inspection and modification, again very
similar to NEC's and Toshiba's.
* Color monitor.
* Mobile navigation.
The motivation for reducing the cost of inspection was made clear to the
audience---Kim pointed out that at Samsung a very large fraction of the
electronic manufacture employees are doing inspection and adjustment
related jobs.
Choi (Chung-ang Univ): Described work in 3D vision. Of course the major
problem is to extract information about a 3D world from 2D images.
This can begin with range finding, for knowing the distance to objects will
then allow one to determine which one is in front, etc.; or it can begin
with finding segments, features, objects,... to which stereo etc. can be
applied. (An occlusion boundary, for instance, allows triangulation on the
occluding edge---it is not a feature of the occluded object.)
The two approaches are
* Passive
Monocular vision requires a-priori information (in some
problems this is available).
Photometric stereo, e.g. using different light sources
Shape from shading, although recovering surface orientation
from a single image is obviously ill posed as are many of the
monocular techniques.
Range data from two different images, or a sequence
* Active
Structured lighting (mentioned work by Prof Sato, and also
work at CMU)
Time of flight (sonar, etc.)
Conventional triangulation as with rangefinders.
None of the techniques is best for all situations and ultimately the
system designer must choose the most appropriate. Systems with low cost and
high performance are not available, certainly not in a factory situation.
There are no camera standardizations. Missing scene parts and shadowing are
a problem, as obviously it isn't possible to deduce 3D data for missing parts
of a scene.
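Two of the ranging techniques listed above reduce to one-line formulas.
The Python sketch below is illustrative only and is not taken from Choi's
talk: it assumes an ideal rectified stereo pair (parallel pinhole
cameras) and a sonar pulse travelling at an assumed 343 m/s in air. All
names and default values are hypothetical.

```python
SPEED_OF_SOUND_M_S = 343.0  # assumption: dry air at roughly 20 C


def time_of_flight_range(round_trip_s, speed_m_s=SPEED_OF_SOUND_M_S):
    """Range from a pulse echo: the pulse covers the distance twice."""
    if round_trip_s < 0:
        raise ValueError("round-trip time must be non-negative")
    return speed_m_s * round_trip_s / 2.0


def stereo_depth(focal_px, baseline_m, disparity_px):
    """Depth by triangulation for a rectified stereo pair.

    focal_px     focal length expressed in pixels
    baseline_m   distance between the two camera centers, in meters
    disparity_px horizontal shift of the same point between the images
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive (point at finite depth)")
    return focal_px * baseline_m / disparity_px
```

Because depth varies as 1/disparity, a fixed matching error in pixels
costs more accuracy for distant objects than for near ones, which is one
reason no single technique suits every situation.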
======================================================================
COMMENTS ABOUT SPECIFIC CONTRIBUTED PAPERS.
A complete list of titles/authors of the presented papers is being
prepared and will be distributed as soon as it is ready. However,
topics discussed included:
Character recognition & document understanding
Image processing & coding
Hough transform
Scene understanding & object recognition
Neural nets
Stereo & shape recognition
Motion & sequential image processing
Sensing & recognition
Mobile robots
Vision paradigm
Computer vision hardware and algorithms
Motion & shape reconstruction
Intermediate & high level vision
Thinning, quadtree & component labeling
3-D modeling & recognition
Feature extraction & object recognition
Applications
I would like to express my sincere thanks to
Prof Timothy Poston
Department of Mathematics
Pohang Institute of Science and Technology
PO Box 125, Pohang, Kyung Buk, 790-330 Korea
Tel: Korea +82-42-562-79-2052, Fax: Korea +82-42-562-79-2799
Email: TIM@VISION.POSTECH.AC.KR
who contributed the following material. Readers should note that many
important Japanese research topics on computer vision were not presented
here.
It has been said that if all a rat knew of rat psychology was the
information in psychology textbooks, he would fail every social
interaction he attempted with another rat. Similarly, if the processing
of his visual input rested on current algorithms for vision, he would be
safest to rely on his excellent sense of smell. Broadly speaking, most
computer vision applications depend on an extremely predictable
environment; "is that a nut or a bolt?" algorithms depend on
consistent lighting and would often report "bolt" for a severed finger.
The highly stereotyped behavior of an animal adapted to cage life (and
no longer viable in the wild) is richness itself compared to any
manufactured system. Since back-of-the-envelope calculations suggest
that the processing capacity of the current generation of supercomputers
is up there with the nervous system of a housefly, it is remarkable
that progress is in fact being made in solving visual tasks far
more interesting to humans than anything a fly can do.
This meeting was reasonably representative of the state of the art. For
example, one Korean paper[1] at this meeting reported on a system for
extracting car types and license plate numbers from camera images, that
is in place and working well in its limited universe of car and plate
types. The problem of workaday character recognition is a much larger
one in East Asia than in pure-alphabet countries (though even there
decorative scripts, from Shatter to ornamental Arabic, make a universe
too wild and varied for existing computer methods). A Japanese high
school graduate is supposed to recognize about two thousand Chinese
characters; a Korean who knows only the phonetic script is functional,
but cannot read (for instance) most newspaper headlines. Identifying
characters from a universe of thousands, even in a fixed typeface, is a
qualitatively different problem from working with Western character
sets. Just as with English writing, handwritten text has far more
variation and consequent difficulty. Thus to achieve over 98%
cumulative accuracy on a realistically large character set is not a
small achievement. This was done by two Japanese papers in radically
different ways; one[2] used Fourier transforms of rectangular windows
within a character to estimate how like a diagonal/vertical/etc. stroke
that part of the character seemed, tested on 881 character categories
from a standard data base of handwritten characters. The other[3]
worked on the cheap printing in a translated Isaac Asimov novel
(processing it in about the time Asimov seems to need to write one),
which involved 1164 distinct characters. This paper used a more
directly geometrical approach, searching for pieces of approximate
straight line within the image, calculating their lengths, and so on.
Many other methods are under development (some of which look unlikely
ever to scale to a large character set with good reliability); this
contention of innumerable ideas reflects the direct importance of the
problem, its attraction to vision researchers as a problem universe of
large but controlled size, and the lack of conceptual convergence in
this area. There are so many uses for automated literacy that effort
and resources will continue to pour in this direction, but it would be
unwise at this time to place any bets on what method---existing or still
to be developed---will finally dominate the field of character
recognition.
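To make the windowed-Fourier idea concrete: a straight stroke
concentrates its spectral energy along the frequency axis perpendicular
to the stroke, so comparing the energy along a few axes gives a crude
"how vertical / horizontal / diagonal is this window" score. The Python
sketch below is only an illustration of that principle, not the method
of paper [2]; the function names, the square binary window, and the four
direction labels are assumptions.

```python
import cmath


def dft2(window):
    """Naive 2-D discrete Fourier transform of a square window."""
    n = len(window)
    spec = [[0j] * n for _ in range(n)]
    for ky in range(n):
        for kx in range(n):
            total = 0j
            for y in range(n):
                for x in range(n):
                    if window[y][x]:
                        phase = -2j * cmath.pi * (kx * x + ky * y) / n
                        total += window[y][x] * cmath.exp(phase)
            spec[ky][kx] = total
    return spec


def stroke_scores(window):
    """Spectral energy along the axis associated with each stroke direction.

    Rows are y (increasing downward), columns are x. The zero-frequency
    term is skipped because it only measures the amount of ink.
    """
    n = len(window)
    spec = dft2(window)

    def power(ky, kx):
        return abs(spec[ky % n][kx % n]) ** 2

    ks = range(1, n)
    return {
        "vertical": sum(power(0, k) for k in ks),        # energy on ky = 0
        "horizontal": sum(power(k, 0) for k in ks),      # energy on kx = 0
        "diagonal_down": sum(power(-k, k) for k in ks),  # stroke along x = y
        "diagonal_up": sum(power(k, k) for k in ks),     # stroke along x + y = const
    }


def dominant_stroke(window):
    """Label of the direction with the largest spectral energy."""
    scores = stroke_scores(window)
    return max(scores, key=scores.get)
```

A real recognizer would use a fast transform and feed the scores from
many windows into a classifier rather than taking a single maximum.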
In any meeting about computer analyses and decision-making, nowadays,
one expects neural networks. At this conference there were five, using
networks for [4] identifying objects in an image ("Choose one out of
Sky/Grass/Road/Tree/Road/Car"), [5] segmenting simple images ("Separate
this stool sketch from the background sketch of floor and folding
screen"), [6] stereo matching, [7] an image thinning method, and [8] a
classifier for polyhedra with up to eight faces, at most four meeting at
a point. As is common for neural net research, the problems handled
were quite small, and while directions for development were pointed out
there was no analysis of the way the network's necessary size and
learning time would scale with the complexity of the problem. In most
network problems, unfortunately, these scaling properties are abominably
bad, so that the network `solution' is no better than a classical
algorithm that takes exponential time or worse, except for the
`learning' that reduces the load on human intelligence in creating the
solution. Some of the papers here may scale usefully---some neural
networks are proving useful in practical applications---but none of
them address the question.
The enormous range of methods applied to scene analysis (optical flow,
modelling of the object seen and comparison of the image with
prediction, fitting a distorted quadric surface, analysis of a moving 3D
outline, shape from shading...) generously represented at this meeting
reflects not only the immaturity of the field (as with character
recognition), but almost certainly the multifaceted nature of the
problem. The human vision system can respond "couple dancing!" to a
grey-toned image, a line sketch, a half-second movie showing only points
of light attached to dancers in the dark... and thus solves its
problems in multiple ways. This multiplicity is presumably in some
sense necessary, as the evolutionary cost of evolving it cannot have
been low; complicated systems have many potential defects, so that many
mutations could cripple them, and very few---at a given moment---improve
their present working. The papers here represent normal progress in
existing approaches to subproblems in the Great Problem of "What am I
seeing?"; a number of papers that specialists will need to read, but
nothing that starts a whole new approach, or represents a step toward
the problem created by the multiplicity itself. Given that a robot fit
to explore a rich environment will almost certainly need (like the
mammalian brain) to use many submethods in visual analysis, how should
it integrate their results? How should the computer/how does the brain
represent the objects about which so much information arrives in
conflicting formats? As each submethod becomes more powerful the
problem of integration or "sensor fusion" becomes more urgent. Since
major progress here would be a large step toward understanding the
dynamics of consciousness, it is not a trivial problem. Not surprisingly,
at this meeting there was no session on integrating the output of the
descriptors for rigid shapes, faces, etc. discussed in the many papers
on how to use camera images, range data, and so forth.
As one might expect, given the respective research populations and
funding of Japan and South Korea, there were 47 papers from Japan
against 33 from the host country, of which a certain number were `trial
flights' by graduate students giving their first conference papers. In
some cases, this was painfully obvious in the quality of the work as
well as in the confidence of the presentation. The experience of
placing work in the setting of a larger and more developed research
effort will certainly be strengthening for Korean work in computer
vision.
[1] Recognition of Car Type and Extraction of Car Number Plate by Image
Processing, Dong-uk Cho, Young-lae Bae and Young-Kyu Kang, SERI/KIST, Korea.
[2] Recognition of Handprinted Characters by FFT,
Tadayoshi Shioyama and Akira Okumura, Kyoto Inst. of Tech, Japan.
[3] An Experiment on Printed Japanese Character Recognition using a PC
for the Braille Translation of Novel Books,
Yasuhiro Shimada and Mitsuru Shiono, Okayama U. of Science, Japan.
[4] Hopfield Net-based Image Labelling with MRF-based Energy Function,
Byoong K. Ko and Hyun S. Yang, Korea Advanced Inst. of Sci. and Tech.
[5] Image Segmentation Using Neural Networks,
Ao Guo-Li, Cui Yu-Jun, Masao Izumi and Kunio Fukunaga,
College of Engineering, University of Osaka Prefecture, Japan.
[6] Stereo Matching using Neural Network of an Optimized Energy Function,
Jun Jae Lee, Seok Je Cho and Yeong Ho Ha,
Kyungbuk National University, Korea.
[7] Automatic Construction of Image Transformation Processes Using Feature
Selection Network, Tomoharu Nagao, Takeshi Agui and Hiroshi Nagahashi,
Tokyo Institute of Technology, Japan.
[8] 3-D Polyhedral Object Recognition using Fast Algorithm of Three Dimensional
Hough Transform and Partially Connected Recurrent Neural Network,
Woo Hyung Lee, Sung Suk Kim, Kyung Sup Park and Soo Dong Lee,
Ulsan University, Korea.
======================================================================
RESEARCH ACTIVITIES AT KOREAN UNIVERSITIES/RESEARCH INSTITUTES
IN COMPUTER VISION
Data collected by - Prof Joon H Han, POSTECH
Prof Hyun S Yang, KAIST
Kyungbuk National University
A. Research Areas
Computer Vision Stereo vision
Pattern recognition
Range image analysis
Motion estimation
Image Analysis Restoration
Enhancement
Edge extraction and thinning
Segmentation
Data compression
Neural Network Pattern recognition
Stereo vision
Image analysis
B. Projects (Partial list)
3-D object recognition from 2-D images
Development of shape recognition and synthesis technology by
using image processing techniques
IC layout pattern recognition by using image processing techniques
3-D shape and motion recognition
Studies on human vision
C. Facilities (Image Processing Lab)
Color Image Processing System
(1) IBM PC/AT with color image processor
(2) Color CCD Camera (512 x 512 x 8bits)
(3) Color Monitor
Pseudo Color/BW image processing system
(1) IBM PC/386 with ITI 151 series image processor
(2) IBM PC/AT with ITEX-PC-Plus
(3) Color CCD Camera (512 x 512 x 8bits)
(4) B/W Monitor
Stereo Vision system
(1) IBM PC/AT with FG-100-AT frame grabber
(2) Two CCD Cameras
(3) B/W Monitor
Laser Range Scanner System (Technical Arts)
(1) 100X Scanner
(2) Solid State Camera (CIDTEC)
(3) RCA Monitor
(4) Laser Power Supply
(5) Visual 500 Terminal
SUN 4/260C Workstation
Color Graphic System
(2) Color monitor (1280 x 1024)
(3) Digitizer Tablet
(4) Plotter
IR Camera
Film Recorder
Printer
D. Faculty - Prof Sung Il Chien
Sogang University
A. Research Areas
Character recognition, Stereo vision, Image coding
B. Faculty Members
Prof Kyung Whan Oh
Prof Rae Hong Park
Seoul National University (Signal Processing Lab)
A. Research Areas
Image coding 2nd generation coding, region based coding
texture analysis/synthesis, motion compensated coding
motion detector, target tracker (real-time)
Computer Vision: Low-level
segmentation (color, B/W), shape matching (relaxation),
polygonal approximation
B. Facilities - Gould 8400 IP + 19" RGB monitor, Micro-VAX II
Image Technology IP512 Image Processing System
SNU RGB Image Processing System, PDP 11/23, IBM PC 386, AT, XT
C. Faculty - Prof Sang Uk Lee
Seoul National University (Automation & Systems Res Inst)
A. Research Areas - Computer Vision (low and high level)
B. Current Projects - On the development of a color vision system employing DSP
Real-time vision system
C. Facilities - SUN 4 workstations, CCD camera, Adaptive robot,
IP 150 Image Processing System, IBM PC/AT, 386, etc.
D. Researchers - Prof Sang Uk Lee (SNU)
Prof Jhong Soo Choi (Chung-Ang Univ)
Prof Rae Hong Park (Sogang Univ)
6 Research Assistants
Yonsei University
A. Research areas
Neural network modeling
Korean character recognition
Dynamic character recognition
Korean character processing computer
B. Facilities
Micro-VAX II, VAX 11/750, Solbourne, IBM PC/AT
Scanner, B/W Camera, Printers
C. Researchers
- Prof Il Byung Lee
- Graduate students (approx 10)
Chung-Ang University (Image & Information Engineering Lab)
A. Research Areas
Medical ultrasound imaging, Computer vision, Visual communication
B. Current Projects
A study on image understanding system using active focusing
and meta-knowledge
C. Facilities
Workstations, PC's (32-bit), Image data acquisition system,
Plotter, Logic analyzer, IBM 3090
D. Researchers
Prof Jong-Soo Choi, one Assistant Professor
Graduate students (17)
Chung-Ang University (Computer Vision & Graphics Lab)
A. Research Areas
Computer vision, Image understanding, Pattern recognition,
Computer graphics
B. Projects (partial list)
Construction of image understanding system
Basic research on image processing
Basic research on artificial intelligence
C. Facilities
CCD Camera, Frame grabber, PC's
D. Faculty - Prof Young Bin Kwon
Choongbuk University (AI lab)
A. Research Areas (Projects)
Development of on-line hand written Chinese character recognition
Evaluation of image skeleton algorithms
B. Facilities
SUN SPARC workstations, Macintosh workstations,
VGA color PC 386, VGA color Notebook 386,
WACOM SD-510C tablet digitizer, Laser beam printers, IBM PC/AT's
C. Researchers
Prof. Seong Whan Lee, Graduate Students 5, Research Scientists 2
Korea Advanced Institute of Science & Technology (KAIST)
(Visual Information Processing Lab)
Center for Artificial Intelligence Research
A. Research Areas
Sensor-based intelligent mobile robot
Knowledge-based computer vision
CAD-based computer vision
Neural network-based computer vision
Character recognition and document understanding
Computer graphics
B. Current Projects
Development of intelligent mobile robot
Knowledge-based image understanding system for 3-D scene
interpretation
CAD-based 3-D object representation and recognition
C. Facilities
PIPE 1/300 Image Processing Super Computer, Mobile robot
SUN workstations, IBM PC 386, CCD video cameras,
400-dpi image scanner, X-terminals, VGA monitors,
21" High resolution color monitor
D. Researchers
Prof Hyun S Yang, KAIST faculty 2
Other university faculty 7, Graduate students 12
KAIST (Image & Communication Lab)
A. Research Areas
Image coding and transmission, Color Image Processing,
Image understanding and applications, Dynamic scene analysis,
Channel coding and wideband systems, Character recognition,
Hierarchical coding, 3-D image processing,
Texture image processing, Shape recognition algorithms
B. Facilities
Micro-VAX II, SUN workstation, Gould IP8400 and Real-time video disk
KAIST vision system, Color camera for TV broadcasting
C. Researchers
Prof Jae Kyoon Kim, Prof Seong Dae Kim, Graduate students (approx 30)
Pohang Inst of Science & Technology (Postech)
(Computer Vision Group)
A. Research Areas
Image processing algorithm developments for parallel processing
computer (POPA),
Pattern classification using computer vision, 3-D object modeling
Stereo vision, Korean character recognition using neural networks,
Image understanding, Robot vision, Medical imaging
B. Projects (partial list)
On-line detection of surface faults on metal plates
Autonomous land vehicle navigation
Development of slab number recognition system by
using an image processor
Development of range image sensor
Development of high-performance parallel computer for
image processing
Wavelet transforms for image processing and image understanding
Development of robot vision system for automatic assembly of
automobile parts
C. Facilities
POPA (Pohang Parallel Architecture, transputer-based)
PIPE model 1 system (image processing and understanding system)
Gould IP 9516 image processor (micro-VAX host)
ITI 151 image processing system, Solbourne 5/602, Sun workstations,
Neuralworks, ANZAplus Neurocomputer, CCD cameras,
Color image scanner, HP Plotter, Color monitors,
White light projection system, PC's, PUMA-560 Robot, NOVA-IO Robot
D. Faculty
Group Leader: Prof Chung Nim Lee
Korea Institute of Science & Technology (KIST)
Systems Engineering Research Inst (SERI)
Computer Vision Group
A. Research Areas
Image Processing System and Applications
Satellite data processing, Computer inspection,
Medical imaging, CAD/CAM and Graphics, Automatic character
recognition, Fingerprint recognition
Research on basic software for remote sensing utilizing image
processing systems
B. Projects (partial list)
Image processing system with microcomputer,
Development of vectorized image processing software for the
processing of remotely sensed data,
Development of automatic testing system using computer vision
and AI techniques
Development of weather satellite data analysis technique and
workstation software for image processing
Development of automatic car license plate recognition system
C. Facilities
SUN workstations, IBM PC/AT based graphics systems,
IBM PC-386 based Graphics system, CRAY-2S/4-128, CYBER 960
Micro-VAX II
D. Researchers
PhD - 4, MS - 4, BS - 5
------------------------------
End of VISION-LIST digest 10.48
************************