The concept of a soccer-playing robot was a major step in autonomous multi-agent systems, in which many agents work together to achieve a common goal. The research field was promoted by the introduction of two international events, organised by the Federation of International Robot-soccer Association (FIRA) and the Robot Soccer World Cup (RoboCup). As in the human game, where a player relies mostly on vision as the source of information about the environment, robot soccer allows cameras to be used as the sole sensor. This research experiments with five concepts to determine the guidelines needed for the implementation of a local monocular vision system. The experiments are: (a) the minimum frame rate for establishing real-time vision; (b) various color models and segmentation techniques and their efficiency in terms of frame rate; (c) the types of objects detected; (d) processing the video as a complete frame (full video) versus a separate video for each camera (split video); and (e) the influence of displaying or hiding the video during gameplay on the frame rate. Based on the findings, a guideline of parameters was established, starting with a minimum frame rate of 16 frames per second (fps). The color models examined were RGB, HSV, and YUV, with mean values of 21.38 fps, 16.00 fps, and 15.60 fps respectively; RGB is the color model of choice as it achieves real-time video. The objects to be detected are basic geometric shapes, namely a circle, a small rectangle, and a large rectangle, which are currently employed in other robot soccer leagues. For distance estimation, the calibration method proves worthwhile even though it gives erroneous results when the detected object is at close or long range. This situation can be addressed by assigning strategies to these ranges, such as using the near range for defense or kicking. Another issue tackled was the effect of displaying the real-time video versus hiding it; the results showed that not displaying the video increases the frame rate by as much as 3%, or about 0.8 fps on average. As the camera is the only sensor applied, the required video processing must have minimal impact on processing time, which results in an increased frame rate.
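The color-model comparison above can be sketched as a small benchmark. This is a minimal illustration, not the thesis implementation: the conversions are the textbook BT.601 (YUV) and hexagonal-cone (HSV) formulas, and the synthetic frame, frame size, and loop count are assumptions chosen only to make the script self-contained.

```python
import time
import numpy as np

def rgb_to_yuv(rgb):
    """RGB -> YUV using the standard BT.601 weighting (one matrix multiply)."""
    m = np.array([[0.299, 0.587, 0.114],
                  [-0.147, -0.289, 0.436],
                  [0.615, -0.515, -0.100]])
    return rgb @ m.T

def rgb_to_hsv(rgb):
    """Vectorized RGB -> HSV for float arrays in [0, 1]; hue scaled to [0, 1)."""
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    v = rgb.max(axis=-1)
    c = v - rgb.min(axis=-1)                      # chroma
    s = np.where(v > 0, c / np.maximum(v, 1e-12), 0.0)
    safe_c = np.maximum(c, 1e-12)
    # Hue is piecewise, depending on which channel holds the maximum.
    h = np.select(
        [c == 0, v == r, v == g],
        [0.0,
         ((g - b) / safe_c) % 6,
         (b - r) / safe_c + 2],
        default=(r - g) / safe_c + 4) / 6.0
    return np.stack([h, s, v], axis=-1)

def mean_fps(fn, frame, n=100):
    """Mean frames per second achieved by repeatedly applying fn to a frame."""
    start = time.perf_counter()
    for _ in range(n):
        fn(frame)
    return n / (time.perf_counter() - start)

# Synthetic 320x240 frame standing in for one wireless camera feed.
frame = np.random.rand(240, 320, 3)
for name, fn in [("RGB", lambda f: f),            # raw frame, no conversion cost
                 ("HSV", rgb_to_hsv),
                 ("YUV", rgb_to_yuv)]:
    print(f"{name}: {mean_fps(fn, frame):.1f} fps")
```

Benchmarks of this shape make the reported ranking plausible: RGB needs no conversion at all, while HSV and YUV each pay a per-frame conversion cost.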
Based on the findings of these experiments, a guideline of parameters can be established for the implementation of a local monocular vision system: use a multiple wireless camera system; process the RGB (raw) video format coming from the camera; use the full video mode when displaying and processing; and, lastly, provide the option to display or hide the video, thereby increasing the frame rate when debugging is not needed. Calibration-based distance estimation can be utilized, but four ranges should be defined and strategies set for these ranges.
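The range-based handling of calibration error can be illustrated with a pinhole-camera sketch: calibrate an effective focal length from one reference measurement, estimate distance from the apparent pixel width of a known-size object, then bucket the result into four strategy ranges. The reference ball width, pixel measurements, and range boundaries below are illustrative assumptions, not values from this work.

```python
def calibrate_focal(ref_distance_cm, real_width_cm, pixel_width):
    """Recover an effective focal length (in pixels) from one reference shot."""
    return pixel_width * ref_distance_cm / real_width_cm

def estimate_distance(focal_px, real_width_cm, pixel_width):
    """Pinhole-model distance to an object of known real width."""
    return focal_px * real_width_cm / pixel_width

def classify_range(distance_cm, bounds=(30, 80, 150)):
    """Bucket a distance into four strategy ranges (boundaries are assumed)."""
    labels = ("near", "mid-near", "mid-far", "far")
    for bound, label in zip(bounds, labels):
        if distance_cm < bound:
            return label
    return labels[-1]

# Calibration shot: a 4.3 cm ball appearing 43 px wide at 100 cm (illustrative).
focal = calibrate_focal(100.0, 4.3, 43)
print(classify_range(estimate_distance(focal, 4.3, 172)))  # wide in frame: near
print(classify_range(estimate_distance(focal, 4.3, 20)))   # small in frame: far
```

Since the estimate degrades at the extremes, the strategy layer consumes only the coarse range label (e.g. "near" triggers defense or kicking), never the raw centimetre value.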