- Vehicle electronics are growing in value but require further standardisation and reductions in power consumption
- Data storage is a major issue — techniques from traditional big data do not work very well with images and video
- Image recognition is improving but research would benefit from wider availability of labelled video datasets
- Further work is required to create greater depth of scenarios and improve simulation processing times
- Realistic visualisation of simulations for humans is different to modelling the sensor inputs vehicle AI interprets
- Understanding machine learning isn’t always hard… sometimes it comes up with simpler rules than we expect!
Market Growth… is forecast at 6% compound annual growth rate (CAGR) for electronics, reaching an average of $1,600 of electronics content per vehicle. For semiconductors the figures are even more impressive: 7.1% CAGR. The specifics of market development are less clear, as these growth figures include L1/2 systems but not full autonomy. Although there is a definite role for the technology, standardisation is a must, requiring a yet-to-be-established framework. Safety is a big challenge: without clear agreement on what safety level is acceptable, definite technical standards cannot be set. Another open issue is the degree to which the car will have to make decisions for itself versus interacting with infrastructure and other vehicles; a key constraint here is the latency (response time) involved in exchanging large data sets. Finally, self-driving chipsets must consume significantly less power than current prototypes.
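As a hedged illustration of the compound-growth arithmetic (the 6% CAGR and the $1,600 endpoint come from the forecast above; the base value and five-year horizon are assumptions chosen for the example):

```python
# Minimal sketch of the compound-growth arithmetic behind the forecast.
# The $1,600 figure and 6% CAGR come from the text; the base value and
# time horizon below are illustrative assumptions, not sourced numbers.

def project_value(base: float, cagr: float, years: int) -> float:
    """Compound a base value forward at a constant annual growth rate."""
    return base * (1 + cagr) ** years

# Hypothetical: ~$1,195 of electronics content compounding at 6% for
# five years reaches roughly the forecast $1,600 average.
print(round(project_value(1195, 0.06, 5)))  # -> 1599
```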
Researchers have gained new insights by translating real world crash data into a virtual environment… the information came from records collected by the regulators. Technology in production today sometimes makes simplistic errors (e.g. lane keeping that follows a bike path rather than the kerbside). Research has shown that it is possible to correlate the virtual models with real world data (for instance replicating a collision with a pedestrian), but the challenge of testing thoroughly remains substantial: multiple different environments are needed; there are thousands of types of crash situation; and each vehicle has unique attributes. Through all of this, it is vital that the results correlate to the real world. Researchers aim to reduce modelling times from days (currently) to hours, with real time as the ultimate goal. Without improvements in processing speed, virtual sample sets are in danger of remaining too small or too slow to be usable.
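A minimal sketch of what such a correlation check might look like in code; every name here (CrashRecord, simulate, the severity metric) is a hypothetical stand-in rather than anything taken from the research described:

```python
# Hedged sketch of a scenario-replay harness: each real-world crash
# record is reconstructed virtually and the simulated outcome compared
# against the recorded one. All names and values are hypothetical.

from dataclasses import dataclass

@dataclass
class CrashRecord:
    scenario_id: str
    recorded_severity: float  # e.g. normalised injury metric from regulator data

def simulate(record: CrashRecord) -> float:
    """Stand-in for the (expensive) virtual reconstruction of one crash."""
    return record.recorded_severity * 0.97  # placeholder result

def correlation_gap(records: list[CrashRecord]) -> float:
    """Mean absolute gap between simulated and recorded severity."""
    gaps = [abs(simulate(r) - r.recorded_severity) for r in records]
    return sum(gaps) / len(gaps)

records = [CrashRecord("ped-impact-042", 0.61), CrashRecord("rear-end-118", 0.34)]
print(f"mean sim-vs-real gap: {correlation_gap(records):.3f}")
```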
The challenge of staying aware of the state of the art in data processing and artificial intelligence… large OEMs are interested in AI in the broadest sense: self-driving, handling customer data and improving business efficiency. The first challenge is the data itself. Not only will the car become a massive source of data, but much of it does not fit easily into existing data structures; images and video are more complicated and unstructured than traditional inputs. With images, it may be necessary to pre-process each frame and store the resulting output, rather than the image itself, to reduce storage space and retrieval time. Image capture and object recognition were identified as a definite area where more work is required and where machine learning is already relevant: for instance, recognising brands of truck trailer may help build broader recognition of what a trailer looks like. By studying a whole range of machine learning activities (a huge resource undertaking), organisations can develop an understanding of the best fit between problems, data collection methods and analysis tools.
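A sketch of the store-the-output-not-the-image pattern described above; detect_objects() is a hypothetical stand-in for whatever detector a real pipeline would run:

```python
# Run detection once, keep the compact structured output, and discard
# (or archive) the raw frame. The detector below is a placeholder.

import json
import hashlib

def detect_objects(frame_bytes: bytes) -> list[dict]:
    """Hypothetical detector; a real system would call a vision model here."""
    return [{"label": "truck_trailer", "confidence": 0.92, "bbox": [104, 80, 510, 330]}]

def to_record(frame_bytes: bytes, timestamp: float) -> str:
    """Store detections plus a hash of the frame, not the frame itself."""
    record = {
        "frame_sha256": hashlib.sha256(frame_bytes).hexdigest(),
        "timestamp": timestamp,
        "objects": detect_objects(frame_bytes),
    }
    return json.dumps(record)  # a few hundred bytes vs megabytes of pixels

print(to_record(b"\x00" * 1024, timestamp=1712.5))
```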
There are different ways of obtaining image data in real time… dedicated chips can translate lidar traces (compatible with multiple lidar types) into an instantly available augmented image. This allows objects to be identified from the raw data and less expensive lidar units to be used; examples showed a 16-line lidar unit being augmented to a higher effective resolution.
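To make the augmentation idea concrete, here is a minimal sketch that interpolates extra scan lines into a 16-line range image; real dedicated chips use far more sophisticated reconstruction, so this only illustrates synthesising resolution between sparse beams:

```python
# Linearly interpolate a 16-line lidar range image up to 31 lines by
# inserting a midpoint row between each pair of adjacent beams.

import numpy as np

def upsample_scanlines(range_image: np.ndarray) -> np.ndarray:
    """Insert an interpolated row between each pair of adjacent beams."""
    rows, cols = range_image.shape
    out = np.empty((rows * 2 - 1, cols), dtype=range_image.dtype)
    out[0::2] = range_image                               # keep original beams
    out[1::2] = (range_image[:-1] + range_image[1:]) / 2  # midpoints
    return out

sparse = np.random.uniform(1.0, 80.0, size=(16, 900))  # 16-line sweep
print(upsample_scanlines(sparse).shape)  # (31, 900)
```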
Machine learning already has applications in ADAS feature sets… it has been put to use in two frequently encountered highway situations: roadworks, and other drivers cutting in and out. Video and radar example data were combined with machine learning and human guidance about acceptable limits of driving behaviour. Interestingly, in both cases, although the machine learning was given multiple data inputs, only a few key elements were required to provide very good accuracy in use. This reduces the sensor inputs and the complexity of processing. For example, machine learning identified a high correlation between the angle of the vehicle in front and whether it was intending to cut in, in preference to more complicated rules combining relative speeds and side-to-side movement. Multiple sensors should still be used for decision making: although a camera is better for monitoring many of the situations, its limited field of view means that radar needs to cover the camera's blind spots.
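A hedged sketch of the kind of simple rule such learning can distil; the feature names and threshold values are illustrative assumptions, not values from the study:

```python
# Toy cut-in detector built on the reported finding: the lead vehicle's
# heading angle is the dominant signal. Thresholds are assumptions.

from dataclasses import dataclass

@dataclass
class Track:
    heading_deg: float  # angle of the vehicle in front relative to the lane axis
    lateral_m: float    # lateral offset toward our lane, metres

def cut_in_likely(track: Track, angle_thresh: float = 4.0) -> bool:
    """A sustained heading toward our lane flags a likely cut-in;
    per the study, relative speed and side-to-side movement added
    little accuracy over this simpler rule."""
    return abs(track.heading_deg) > angle_thresh and track.lateral_m > 0.5

print(cut_in_likely(Track(heading_deg=6.2, lateral_m=0.9)))  # True
```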
The car will become one of the biggest generators of natural language data… and its proper use will enable manufacturers to create a personalised experience that the customer values. For relatively complex commands (“when X happens, then please do Y”), contemporary techniques achieve 95% correct transcription of what the customer is saying and task completion rates in the mid-80s per cent. This is encouraging but shows that further development is needed. OEMs will also have to create ecosystems that allow them to control the customer experience inside the cabin, yet remain seamless with the personal assistants the customer may have on their mobile phone or home speaker system.
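As a sketch of the task-completion half of the problem, here is a toy parser that turns a transcribed conditional command into a trigger-action rule; the regular expression and the Rule structure are assumptions for illustration, not a production grammar:

```python
# Parse a "when X happens, then do Y" transcript into a trigger-action
# rule. Pattern and structure are illustrative assumptions.

import re
from typing import NamedTuple

class Rule(NamedTuple):
    trigger: str
    action: str

def parse_command(transcript: str) -> Rule | None:
    """Match the two clauses of a conditional voice command."""
    m = re.match(r"when (.+?),? then (?:please )?(.+)", transcript.strip(), re.I)
    return Rule(m.group(1), m.group(2)) if m else None

print(parse_command("When I get within 5 miles of home, then please turn on the heating"))
# Rule(trigger='I get within 5 miles of home', action='turn on the heating')
```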
New techniques are improving image recognition… On industry benchmark tests, computer image recognition is now superior to humans. In some specific use cases this already has practical applications, for example a smartphone app that assesses melanomas. However, at around 97% correct identification of a random image (versus about 95% for humans), further improvement is required. Different methods are being tested, with greater progress on static images than on video; this is partly due to the inherent difficulty of video, but also because video has less training data: smaller libraries and fewer labelled categories. Video identification accuracy can be improved by running several different methods in parallel. One of the most promising approaches is turning the video into a set of 2D images with time as the third dimension, a technique pioneered by DeepMind (now part of Google). Combining this process with different assessment algorithms (such as analysing the first and nth frame rather than all frames), teams have achieved accuracy of nearly 90% for gesture recognition. Generally, late fusion (a longer gap between frames) gives better results than early fusion, though the combination of processing algorithms that yields the best accuracy varies. Progress is happening all the time, and new ways of addressing machine learning problems sometimes create step changes, so improvement may not be at a linear rate.
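A minimal sketch of the first-and-nth-frame late-fusion idea; frame_features() stands in for any pretrained 2D feature extractor, and the shapes, frame gap and linear head are assumptions:

```python
# Extract per-frame features with a shared 2D model, then fuse two
# temporally distant frames before classifying (late fusion).

import numpy as np

def frame_features(frame: np.ndarray) -> np.ndarray:
    """Stand-in for a 2D CNN feature extractor (returns a feature vector)."""
    return frame.mean(axis=(0, 1))  # toy pooling over height and width

def late_fusion_logits(video: np.ndarray, gap: int, weights: np.ndarray) -> np.ndarray:
    """Classify from the first frame and the frame `gap` steps later."""
    fused = np.concatenate([frame_features(video[0]), frame_features(video[gap])])
    return weights @ fused  # linear head over the fused features

video = np.random.rand(30, 64, 64, 3)  # 30 frames, 64x64 RGB
weights = np.random.rand(10, 6)        # 10 gesture classes, 2 frames x 3 features
print(late_fusion_logits(video, gap=15, weights=weights).shape)  # (10,)
```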
It is hard to create different virtual environments for vehicle testing… Using video game tools and very experienced developers, near photo-realistic models can be created, but this appears to be the easy part! Because the structure of computer graphical data is different from real-world sensor data, models need to be adjusted to create the correct type of artificial sensor inputs. This is even more challenging with radar and lidar input data, as the model must accurately simulate the “noise”, which is a factor of both the sensor and the material it is detecting. Perfecting this could take several years. More immediately useful is the ability to create virtual ground truth (e.g. that is a kerb) that can serve SAE Level 2 development. Because L1/2 inputs are more binary, sophisticated sensor simulation issues are less relevant. Researchers believe that a virtual environment of 10km-15km is sufficient to assist development of these systems, assuming the ability to impose different weather conditions (snow, heavy rain etc).
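To illustrate what simulating sensor “noise” can involve, here is a minimal sketch that adds distance-dependent jitter and reflectivity-dependent dropouts to clean simulated lidar ranges; the noise model and reflectivity values are assumptions, not the researchers' method:

```python
# Inject sensor "noise" into clean simulated lidar ranges. As noted
# above, real noise depends on both the sensor and the material being
# detected; the distributions here are illustrative only.

import numpy as np

def add_lidar_noise(ranges: np.ndarray, reflectivity: np.ndarray,
                    rng: np.random.Generator) -> np.ndarray:
    """Gaussian range jitter that grows with distance, plus dropouts
    (returned as NaN) where reflectivity is too low for a return."""
    jitter = rng.normal(0.0, 0.02 + 0.001 * ranges)    # metres
    noisy = ranges + jitter
    dropout = rng.random(ranges.shape) > reflectivity  # weak returns lost
    return np.where(dropout, np.nan, noisy)

rng = np.random.default_rng(0)
clean = np.array([5.0, 20.0, 60.0])
refl = np.array([0.9, 0.5, 0.2])  # e.g. kerb, tarmac, dark vehicle
print(add_lidar_noise(clean, refl, rng))
```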