The vOICe for Camera Phones with Java ME

By Laura Brown,2014-01-20 04:22

    © Peter B.L. Meijer 2009. All rights reserved.

    The vOICe for Mobile Camera Phones

    For information about installing the software, please first visit the web page

    In reading this manual, it is assumed that you have already succeeded in installing and running The vOICe seeing-with-sound MIDlet [1][2][3]. The purpose of this manual is mostly to discuss the available key commands in more detail. You may wish to use a headset in order not to annoy other nearby people with the rather unusual and hence attention-drawing sounds from your phone. Moreover, a stereo headset is advised for use with phones that offer stereo sound capabilities [4]. The vOICe’s options menu - called up with the phone’s Options key - contains a “Channels” entry that allows you to select mono (default), stereo or 3D audio channels. The stereo and 3D audio options ease the perception of The vOICe’s visual left-to-right scanning, but they may give severely distorted mono sound on phones that lack stereo capabilities. Just check what works for your phone. Also, some phones contain built-in stereo speakers on their side [5], and the “View rotation” entry in the options menu can be used to correspondingly adapt the camera view when holding the phone with left and right speaker on the left and right, respectively.

    Note that while using a screen reader it may be necessary to first mute The vOICe (with key "0") to hear the screen reader speak all menu and submenu items under the phone’s Options key. After changing settings, you can then unmute The vOICe again with key "0".

[1] The vOICe MIDlet runs on MIDP-2.0 and MMAPI compliant camera phones. Note that on many phones you must have the "Warning tones" setting in your active profile turned On, or else The vOICe may not sound properly, if at all. Also, with many Nokia phones you should after installation (also of an upgrade) select Tools | (App) Manager | The vOICe | Options | (Suite) Settings | Multimedia | Ask first time, or else user permission may be asked for every single camera snapshot. Other phones may hold similar settings. Depending on the phone, the application Manager may be found in the main application Menu or under Menu | Tools.
[2] Some phones automatically launch their built-in camera application upon opening the camera cover, and in that case you first need to close that other camera application before The vOICe can access the camera. This is because only one application can access the camera at any given time. Of course you should not forget to slide open any lens cover in the first place, or else the view will remain “black” because no light can enter the camera.
[3] On some phones, user permission to control (and turn off) the camera shutter sound will be requested after startup of The vOICe.
[4] For example the Nokia 6620, Nokia 6630 and Nokia 9500 camera phones have stereo sound capabilities.
[5] For example the Nokia N82. The N82 may be held at head level with the two speakers facing upwards.

    Once started, The vOICe MIDlet continuously grabs and sounds live snapshots from your phone camera. There are no connection costs while using it, because The vOICe MIDlet runs off-line. Each camera snapshot is sounded via a left-to-right scan through the view, while associating height with pitch and brightness with loudness. By default, a black-and-white camera view is sounded in just one second. For example, a bright rising line on a dark background sounds as a rising pitch sweep, a small bright spot sounds as a short beep, and a bright filled rectangle sounds as a noise burst. The vOICe’s simplest application is as a light probe, but it is actually far more powerful because its changing polyphonic visual sounds or “soundscapes” now also track position and shape of objects, even with multiple objects within your camera view. Thus it allows you to locate light sources, recognize basic image patterns such as stripes and various textures, find borders, identify shapes, and so on. In addition, The vOICe MIDlet offers a number of color detection features, and includes a talking color identifier.

    Many settings are accessible through the application’s Options menu if you have a phone screen reader, but there is also built-in speech support for the main features. For compatibility with phone screen readers, The vOICe supports two menu styles: the "Textual" style for the submenus is only advised for use with the old Talks 1.40.1 (to avoid crashes), while the "Normal" style is advised with Mobile Speak as well as Talks 2.0 and later. In addition, a number of keyboard shortcuts exist for direct access to various features, and we begin with a brief overview of the main key commands. At the end of this document a table overview is given.

    The "0" key toggles the muted state. Pressing this key twice in rapid succession, like “00”, toggles a muted paused state that minimizes CPU load while releasing the camera resource for best responsiveness when accessing the menus with a phone screen reader. The "1" key toggles the negative video mode, which can help to see/find small or thin dark items on a bright background. The "3" key toggles the built-in speech off and on. The "7" key toggles a mode that helps prevent visual sound stuttering and buzzing on devices that cannot handle simultaneous visual sound rendering and playing. You can try it to find out what works best with your phone. The "9" key cycles over different contrast enhancement modes. The "*" (star, asterisk) key toggles the talking color identifier on and off. The "#" (pound, hash) key cycles over different sound volume levels when not muted.

    Other settings are controlled with the joystick. The default audio sample rate is 16 kHz, but lower sample rates can be selected by using the "DOWN" key (joystick down), and higher sample rates can be selected by using the "UP" key (joystick up). Available sample rates are 8 kHz, 11 kHz, 16 kHz and 22 kHz, but phones need not support all of these sample rates. Lower sample rates give lower sound quality, but may make the phone more responsive. The "RIGHT" key doubles the visual sound duration to at most two seconds, while the "LEFT" key halves the visual sound duration to at least half a second. Note that on some phones the "UP", "DOWN", "LEFT" and "RIGHT" keys may be mapped through the "2", "8", "4" and "6" numeric keys, respectively. Many of the program settings persist across multiple runs of The vOICe MIDlet.

    Now more about the color detection features. As was stated above, the "*" (star) key toggles the talking color identifier on and off. This mobile color recognizer speaks the color of whatever shows at the center of your camera view, while alternating with the visual sound of the camera view that tells you about the shape and brightness of items in your view. If you prefer to only hear the talking color identifier, simply press the "*" (star) key twice in rapid succession, much like a double-click, and you will then only get to hear the color names. So “*” toggles color identification alternating with visual sounds, while “**” toggles color identification without the visual sounds.

    Pressing the joystick “Fire” button will speak the color name once, even if The vOICe was muted, and on suitable phones [6] it will use the built-in flash. In any case, recognized colors include (dark, normal, and light) red, green, blue, cyan, yellow, orange and magenta, as well as combination colors such as red-orange. Black, grey and white are also identified, bringing the total number of identified colors and shades to 47. Beware that the choice of color names can be culturally biased: cyan is a color in between green and blue, while magenta is basically the same as the color purple. Also, light-magenta and light-red make for the color pink or very similar colors, while dark-red-orange, dark-orange and dark-orange-yellow appear as various shades of brown. Dark yellow-green makes for olive-green.
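    The total of 47 names can be reproduced with a small counting sketch. The grouping used here is an assumption, not something the manual spells out: seven base hues, seven combination hues formed from neighbours on the color wheel, each in a dark, normal and light shade, plus five achromatic names (the manual itself only lists black, grey and white, so "dark-grey" and "light-grey" are guesses). The function names are hypothetical.

```python
# Hypothetical reconstruction of the color name list. The grouping is an
# assumption chosen because it reproduces the total of 47 stated in the manual.
BASE_HUES = ["red", "orange", "yellow", "green", "cyan", "blue", "magenta"]


def combination_hues(hues):
    """Join each hue with its neighbour on the color wheel, e.g. red-orange."""
    return [f"{hues[i]}-{hues[(i + 1) % len(hues)]}" for i in range(len(hues))]


def all_color_names():
    """Return every shade name under the assumed grouping."""
    hues = BASE_HUES + combination_hues(BASE_HUES)
    shaded = []
    for hue in hues:
        shaded += [f"dark-{hue}", hue, f"light-{hue}"]
    # Assumed achromatic names; only black, grey and white appear in the manual
    greys = ["black", "dark-grey", "grey", "light-grey", "white"]
    return shaded + greys


names = all_color_names()
print(len(names))  # 47
```

    Under this assumption the count is 14 hues times 3 shades plus 5 achromatic names, and the list contains the names the manual mentions explicitly, such as dark-red-orange and dark-orange-yellow.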

    Results of color recognition inevitably depend on ambient light and camera quality. Try to use good lighting whenever possible, preferably broad daylight. Still, under relatively low light conditions, better results may be obtained by first calibrating The vOICe for the given visual environment. To do this, point the camera to a known white surface (such as a white sheet of paper) near the object whose color you want to identify, and apply the “Calibrate white” entry in The vOICe’s options menu [7], which will basically tell The vOICe that this surface really is white or light grey rather than its actual grey or dark grey appearance. In fact it will also correct for the yellowish colors from incandescent lighting and many other sources of color bias. Next you can point the camera to other items of interest to identify their colors.

    Apply the calibration option with care: only apply it when you are certain that the full camera view is indeed white and relatively bright, or else you may get very poor color identification results due to a badly skewed color calibration! Calibration settings do not persist across runs to avoid unintended continued use of a calibration that would no longer match changing ambient light conditions. The vOICe does not normally need calibration in broad daylight conditions, but if applied with care, it can yield significantly more accurate color recognition results under relatively low light conditions. The calibration process takes only about a second and applies for the duration of the run unless you recalibrate or reset The vOICe via its menus.
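    To illustrate why a badly chosen reference skews every later reading, here is a minimal sketch of white calibration. It assumes simple per-channel scaling of RGB values; the manual does not say how The vOICe actually calibrates, and the function names are hypothetical.

```python
def white_gains(white_rgb, target=255.0):
    """Per-channel gains that map the observed 'white' reference to neutral.

    white_rgb: average (r, g, b) of the view while it shows a white surface.
    Assumption: plain per-channel scaling, not the MIDlet's real method.
    """
    if min(white_rgb) <= 0:
        raise ValueError("reference is too dark to calibrate against")
    return tuple(target / c for c in white_rgb)


def apply_gains(rgb, gains):
    """Scale one pixel by the calibration gains, clamping to the 0..255 range."""
    return tuple(min(255, round(c * g)) for c, g in zip(rgb, gains))


# A sheet of paper that looks dim and yellowish under incandescent light
gains = white_gains((180, 160, 110))
print(apply_gains((180, 160, 110), gains))  # (255, 255, 255)
print(apply_gains((72, 64, 44), gains))     # (102, 102, 102)
```

    Both the dim yellowish paper and a darker patch with the same tint come out neutral. If the reference surface were not actually white, the same gains would push every subsequent color in the wrong direction, which is the risk the manual warns about.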

[6] E.g., the Nokia N70, N90 and N91 support the required Advanced Multimedia Supplements (AMMS, JSR-234).
[7] You can also apply long-press "*" (long-press star key) as a shortcut.

    The color identifier tells you the color at the center of the camera view, but sometimes you may wish to know where items of a given color are. Rather than pointing the camera around until the color identifier finally “hits” the object with the color of interest, you can tell The vOICe to sound the entire camera view but only sound items of the color that you specified. This is done either via the color filter options in the menus or by keying the first letter of the supported color name, namely “r” for red, “g” for green, “b” for blue, “c” for cyan, “y” for yellow, “o” for orange and “m” for magenta.

    Now you need to know how to enter these letters, unless your phone includes a QWERTY keyboard [8]. As you may know, letters are associated with keys 2 through 9 on your phone. In particular, key “2” holds the associated letters “a”, “b” and “c”, or “abc” for short. If you press key “2” once in The vOICe, you specify the digit “2”, but if you press key “2” multiple times in rapid succession, you get to the letters “a”, “b” and “c”. Pressing key “2” twice means “a”, pressing key “2” three times means “b” (which toggles the blue-only color filter), and pressing key “2” four times means “c” (which toggles the cyan-only color filter). The same principle applies to the other numeric keys. Key “3” holds “def”, key “4” holds “ghi”, key “5” holds “jkl”, key “6” holds “mno”, key “7” holds “pqrs”, key “8” holds “tuv”, and key “9” holds “wxyz”. Therefore, if you want to see and find only green items in your view, you press key “4” twice to specify “g” for green, or if you want to see and find only red items in your view, you press key “7” four times to specify “r” for red. These functions act like a toggle, so applying the same one another time turns the color filter off to return to the normal mode of operation. (Alternatively, you may also press key “9” twice to apply “w” for white which is equivalent to having no color filter.)
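    The press-count rule above can be written down as a small lookup. This is only a sketch of the rule as the manual states it; the function name is hypothetical, the timing that decides what counts as "rapid succession" is not modeled, and raising an error for press counts beyond the last letter is an assumption.

```python
KEY_LETTERS = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}


def decode_presses(key, count):
    """Translate `count` rapid presses of `key` into a digit or a letter.

    One press yields the digit itself; two or more presses step through the
    letters printed on the key, so two presses give the first letter.
    """
    if count == 1:
        return key
    letters = KEY_LETTERS.get(key, "")
    if not 2 <= count <= len(letters) + 1:
        raise ValueError(f"key {key!r} has no letter for {count} presses")
    return letters[count - 2]


print(decode_presses("4", 2))  # g
print(decode_presses("7", 4))  # r
```

    Note that the letter index is the press count minus two, because the first press is reserved for the digit.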

    If you want to run a more complete analysis of what items of what shape and of what color show where in your camera view, you can press key “2” twice to toggle “a” for “Analyze”, which will then cycle over all available color filters for finding any objects and shapes that are red, green, blue, cyan, yellow, orange or magenta.
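    As a rough illustration of what a color filter and the Analyze cycle might do, the sketch below keeps only pixels whose hue is closest to the selected color and blacks out the rest. The hue centres, the saturation threshold and all function names are assumptions made for this example; the manual does not describe the actual classification.

```python
import colorsys

# Assumed hue centres in degrees; the manual does not give the real thresholds
HUE_CENTRES = {"red": 0, "orange": 30, "yellow": 60, "green": 120,
               "cyan": 180, "blue": 240, "magenta": 300}


def nearest_color(rgb, min_saturation=0.3):
    """Return the filter name closest in hue, or None for greyish pixels."""
    r, g, b = (c / 255.0 for c in rgb)
    h, s, _ = colorsys.rgb_to_hsv(r, g, b)
    if s < min_saturation:
        return None
    hue_deg = h * 360.0

    def distance(name):
        d = abs(hue_deg - HUE_CENTRES[name])
        return min(d, 360.0 - d)      # hue wraps around at 360 degrees

    return min(HUE_CENTRES, key=distance)


def apply_filter(image, color):
    """Keep the brightness of matching pixels and black out everything else."""
    return [[max(px) if nearest_color(px) == color else 0 for px in row]
            for row in image]


def analyze(image):
    """Cycle over all filters, yielding (color name, filtered brightness image)."""
    for color in HUE_CENTRES:
        yield color, apply_filter(image, color)
```

    Each filtered image is a plain brightness map, so it could be sounded with the same left-to-right scan as an unfiltered view.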

    The combination of color filters with the visual sound bitmaps implies that over 4000 (namely 64×64) different locations for colored items can be represented, while at the same time including shape, shading and texture information - in just one or two seconds of sound. Under the general image-to-sound mapping, top left gives high pitch early in the visual sound, bottom left gives low pitch early in the visual sound, top right gives high pitch late in the visual sound, and bottom right gives low pitch late in the visual sound, with other positions giving intermediate values in pitch and time.
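    The position-to-sound mapping can be sketched as a function from a pixel position in the 64×64 view to an onset time and a pitch. The frequency range of 500 to 5000 Hz and the exponential spacing of pitches are assumptions for illustration only; the manual gives no frequencies, and the function name is hypothetical.

```python
def pixel_to_sound(row, col, size=64, duration=1.0, f_low=500.0, f_high=5000.0):
    """Map a pixel position to (onset time in seconds, frequency in Hz).

    row 0 is the top of the view and col 0 the left edge. The frequency range
    and the exponential spacing are assumed values, not taken from the manual.
    """
    onset = duration * col / size               # left-to-right scan over time
    height = (size - 1 - row) / (size - 1)      # 0.0 at the bottom, 1.0 at the top
    frequency = f_low * (f_high / f_low) ** height
    return onset, frequency


top_left = pixel_to_sound(0, 0)
bottom_right = pixel_to_sound(63, 63)
print(top_left, bottom_right)
```

    With these assumptions the top-left pixel sounds first and highest, and the bottom-right pixel sounds last and lowest, matching the four corner cases described above. Doubling `duration` to 2.0 stretches the onset times without changing the pitches.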

[8] The Nokia 9500 camera phone includes a QWERTY keyboard.

    Let’s consider an example where the general scanning of the visual sounds is combined with color filters to solve a practical problem. Suppose you want to know the color of something small or thin, say a thin electrical wire. Then it is extremely difficult to orient the camera such that the center of the camera view points exactly at this item of interest to get the color identification right. However, by using the visual sounds of the full view along with the "Analyze" submenu option for filtering colors (keyboard shortcut "a"), The vOICe will filter for each color in turn, such that at some point it only sounds any red items in the view along with saying the color name "red", and any red wire will appear as a single tone going up or down in pitch depending on its visual orientation. Of course such advanced uses may require some practice depending on the exact nature of what you are trying to accomplish.

    On suitable phones [9], pressing key “p” will save a snapshot picture to the memory card. The resulting JPEG format image file [10] contains a numeric timestamp [11] in the filename, e.g., as in "vOICe_1155477325843.jpg". The timestamp ensures that each snapshot automatically receives a unique filename. You may have to give several permissions while saving, depending on the phone’s security limitations. The saved image file may subsequently be used for many purposes, such as OCR (optical character recognition), or for sharing with friends. The file location is either the Images folder or the root of the memory card, depending on the type of phone. After saving the image, which may take several seconds, normal operation resumes.
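    Since the number in the filename counts milliseconds since January 1, 1970, it can also be converted offline rather than with an online converter. The following is a minimal sketch on a desktop computer; the helper name is hypothetical and it assumes the filename follows the "vOICe_<digits>.<extension>" pattern of the example above.

```python
from datetime import datetime, timezone


def snapshot_time(filename):
    """Extract the UTC capture time from a name like 'vOICe_<millis>.jpg'."""
    stem = filename.rsplit(".", 1)[0]           # drop the extension
    millis = int(stem.split("_", 1)[1])         # digits after the underscore
    seconds, remainder = divmod(millis, 1000)   # integer math avoids float rounding
    moment = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return moment.replace(microsecond=remainder * 1000)


print(snapshot_time("vOICe_1155477325843.jpg").isoformat())
```

    For the example filename this gives 13 August 2006, 13:55:25 UTC. The result is in UTC; whether the phone clock was set to the correct time when the snapshot was taken is a separate matter.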

    When using the phone camera outside in the sunshine, color readings can be badly affected by glare if there is direct sunlight on the phone. In such situations, try to cup one hand over the phone without blocking the camera view, such that your hand acts like a sunshade - much like a hat can keep your face out of direct sunlight.

    Finally, there is support for an additional very special color: skin. Pressing key “1” twice, or “11”, will toggle the skin-only color filter. This will in principle only sound any exposed skin in your view, such as faces and hands, which might be useful, for instance, for determining how many people are nearby or for locating empty chairs in a conference room. The skin color filter also takes into account typical racial differences. However, certain materials such as wood can have a color that is very similar to skin, in which case you need to also take into account apparent shape and size in the visual sounds to try and determine for yourself if results of the skin-only filter only show skin. The best way to start and learn is to experiment. There should be a difference with and without clothes on.

    In all uses, please stay aware that pointing the phone’s camera at people who do not know you or The vOICe, in public places or elsewhere, might trigger hostile reactions, for instance because people may think that you are taking their photograph without their permission or otherwise invading their privacy. Similar issues may apply when pointing the camera at certain properties.

[9] The file I/O standard JSR-75 must be supported, e.g., as with the Nokia 6630 and Nokia 6680. Otherwise the screen will show a “File I/O not supported” error message (may not be detectable with screen reader).
[10] On phones that do not support JPEG, The vOICe will try to save in PNG format.
[11] Timestamped filenames can if desired be converted to human-readable dates and times using an online timestamp converter, because The vOICe timestamps are in fact the number of milliseconds since January 1, 1970.


    Key        Function                                Default
    0          Toggles muted state                     Off
    1          Toggles negative video                  Off
    3          Toggles speech feedback                 On
    7          Toggles "anti-stutter/buzzing" mode     Off
    9          Cycles contrast enhancement             100%
    *          Toggles talking color identifier        Off
    #          Cycles sound volume levels              50%
    UP         Higher sample rate, up to 22 kHz        16 kHz
    DOWN       Lower sample rate, down to 8 kHz        16 kHz
    LEFT       0.5 or 1 second visual sound            1 second
    RIGHT      1 or 2 second visual sound              1 second
    FIRE       [Flash and] say color                   Off
    00         Mute and pause (low CPU load)           Off
    **         Color identifier, no visual sounds      Off
    ##         Toggle blinders (narrow view)           Off
    r          Red-only color filter                   Off
    g          Green-only color filter                 Off
    b          Blue-only color filter                  Off
    c          Cyan-only color filter                  Off
    y          Yellow-only color filter                Off
    o          Orange-only color filter                Off
    m          Magenta-only color filter               Off
    11 or s    Skin-only color filter                  Off
    a          Analyze colors by cycling filters       Off
    p          Save snapshot picture to memory card

    Overview of The vOICe key commands
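    The four joystick commands in the overview behave like steps through two short lists with fixed end points. The sketch below models only that behavior, using the sample rates, durations and defaults stated in this manual; the class and attribute names are hypothetical, and it does not model phones that lack support for some sample rates.

```python
SAMPLE_RATES_KHZ = [8, 11, 16, 22]   # rates listed in the manual
DURATIONS_S = [0.5, 1.0, 2.0]        # halving and doubling stay within these


class ScanSettings:
    """Hypothetical holder for the two joystick-controlled settings."""

    def __init__(self):
        self.rate_index = SAMPLE_RATES_KHZ.index(16)    # default 16 kHz
        self.duration_index = DURATIONS_S.index(1.0)    # default 1 second

    def press(self, key):
        """Apply UP, DOWN, LEFT or RIGHT; other keys are ignored here."""
        if key == "UP":
            self.rate_index = min(self.rate_index + 1, len(SAMPLE_RATES_KHZ) - 1)
        elif key == "DOWN":
            self.rate_index = max(self.rate_index - 1, 0)
        elif key == "RIGHT":
            self.duration_index = min(self.duration_index + 1, len(DURATIONS_S) - 1)
        elif key == "LEFT":
            self.duration_index = max(self.duration_index - 1, 0)

    @property
    def sample_rate_khz(self):
        return SAMPLE_RATES_KHZ[self.rate_index]

    @property
    def duration_s(self):
        return DURATIONS_S[self.duration_index]


settings = ScanSettings()
settings.press("UP")
settings.press("RIGHT")
print(settings.sample_rate_khz, settings.duration_s)  # 22 2.0
```

    Pressing UP or RIGHT again at this point changes nothing, because both settings are already at their upper limits.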


Quirks mode

    You can also experiment with a “bat call” quirks mode toggled by a long-press of the FIRE button: this gives you two loud but very brief high-pitched chirps in rapid succession, much like an audible version of the clicks or sound flashes emitted by bats during echolocation. The double sound flashes may thus be used with the phone’s built-in speaker to detect nearby obstacles from any echoes that you hear. The sound flash patterns are repeated with the same interval used for the visual sounds, every second by default. If you prefer, you can toggle use of single sound flashes by pressing the “1” key while in the bat call mode. You can also independently cycle the audio volume of the bat calls by pressing the "#" (pound, hash) key while in the bat call mode. If necessary, use your hand to form a cone over the phone’s speaker for improved directionality of the sound flashes, and hold the phone in a position that is consistently aligned with your ears, for instance in front of your face.

    Please note that in general palate clicks made with your tongue may work better, because you can more readily adapt these to your current situation. Since these tongue clicks originate from your mouth they are also always consistently aligned with your ears, such that you can better train for very subtle changes as needed for echolocation purposes. On the other hand, your mouth may run dry after a few minutes of intensive tongue clicking, so use whatever suits you best.


For any questions or suggestions please email