Archive for the ‘Cyber-Physical Systems’ Category

Should we go back to mechanical systems to ensure the safety of cars?

March 10, 2010

A reporter asked me yesterday whether there are any ‘independent’ organizations or groups that test the safety of car electronics. An independent entity would be one that does not work with any carmaker, automotive supplier or plaintiffs in a car accident. Unfortunately, I had to answer ‘No’. The reason for this absence of independent entities who can offer “unbiased” feedback is simple: how will they support themselves? Automotive electronics is complex; one needs the services of experts in mechanical engineering, electrical engineering, control systems, electronics hardware, embedded real-time software, fault-tolerant systems, sensors, actuators, EMI and ESD. There are hundreds of models sold *every* year. The cost of sustaining such a testing operation will be enormous, and unless one has a service contract with one of the automakers, or is looking at specific issues for a plaintiff, it is very difficult to sustain the operation. Let’s look at the landscape and how we can help the situation.


Is ESD (Electro-Static Discharge) the Culprit behind Toyota ETCS-I?

February 23, 2010

Note: This post is a subset of a posting titled “Possible Electronics Causes for Sudden Unintended Acceleration”.

Electro-Static Discharge (ESD) Issues

Electrostatic discharge (ESD) is the name given to the sudden and short-lived electric current that flows between two objects at different electrical potentials (voltages) caused by direct contact or induced by an electrostatic field. These currents, while short-lived, are unwanted and may cause damage to electronic equipment.

The simplest ESD example that people see in practice is the very brief spark that happens in winter when you touch a metallic object, get a ‘shock’ and see a spark. (Charge develops on your body as you walk across a carpet, for example, and it gets discharged when you touch a conducting material). More than 1 kV can be generated, albeit for a very short time! At home, one often uses voltage surge suppressors to connect sensitive electronics like TVs and computers to wall power. Without such surge suppressors, voltage spikes like lightning can enter your electronics and cause permanent damage. Lightning, therefore, is another classic example of ESD.

Due to the damage that ESD can cause to electronics, there are military, industrial, automotive and international standards to deal with the issue. Popular consumer electronics like camcorders, mobile phones, and digital cameras have built-in voltage shunts that keep these spikes from reaching the core of the electronics and damaging them.

ESD damage to electronics, which worsens over time, can fall into different categories. One, the damage can be permanent and the device fails. This is often referred to as a hard fault. Two, the damage seems to reset itself and function correctly (for a while) when the device is shut down and restarted. This is often referred to as a soft fault. Some standards even define finer distinctions.   SSUA on Toyota vehicles seems to correlate to soft faults in many cases, but in cases where a vehicle was totally wrecked, the problem could have been a hard fault.

Possible Electronics Causes for Sudden Unintended Acceleration

February 23, 2010

Sudden unintended acceleration has occurred in disproportionate numbers (relative to market share) on Toyota vehicles since the introduction of the ETCS-I (Electronic Throttle Control System with Intelligence) in 2001 across multiple popular Toyota models.

Many complaints in the NHTSA database clearly indicate that in several cases, the accelerator pedals were not stuck in a floor mat when SUA occurred. The fix for sticky pedals that Toyota recently announced also seems to be just a red herring in this whole context.

The timing of the surge in SUA complaints and the lack of other causes points the finger directly at electronics, namely ETCS-I.

Nature of Sudden Unintended Acceleration

While the term sudden unintended acceleration (SUA) has been used quite commonly for months, a better term to use would be sudden and sustained unintended acceleration (SSUA), i.e. acceleration continues in a sustained fashion often leading to high, unsafe and unstoppable speeds.

Once SSUA happens, vehicles have been totaled in a wreck or had to be stopped by putting the transmission into neutral. After reset, in many cases, the vehicle runs normally. However, SSUA can occur again in the future.

Possible Scenarios

  1. The root cause of SSUA is not in the ETCS-I at all – this seems very unlikely given the above data.
  2. A root cause of SSUA lies in the ETCS-I, and Toyota could not repeat the problem in the lab. While the problem may happen under complex or non-obvious conditions, this would likely imply inadequate testing on the part of Toyota engineers (for not being able to think out of the box to find a lurking problem over several years) and/or insufficient investment of resources.
  3. Problems in the ETCS-I were indeed diagnosed in the lab or in the field when Toyota tested vehicles which exhibited SSUA. This would be a major surprise since Toyota has claimed many times in recent years that electronics was not a cause of SSUA. (The President of Toyota USA is expected to make this claim today in the ongoing Congressional hearings). If documents are found that Toyota found some problems in electronics but chose not to disclose them, this would naturally reflect very serious technical, procedural and corporate culture problems within the company and one hopes would not be the case.

Toyota, in fact, announced a few hours ago that brake overrides will be added on more Toyota cars than (quietly) announced earlier. The move is described as intended to “provide an additional measure of confidence”. Lawyers pursuing class action lawsuits against Toyota could be expected to argue that this is an indirect acknowledgement of problems with electronics.

Electro-Magnetic Interference (EMI) Issues

Can EMI (electro-magnetic interference) cause SSUA?

EMI (also called Radio Frequency Interference or RFI) corresponds to disturbances arising from electromagnetic conduction or electromagnetic radiation from a source external to the system under consideration. When your TV reception is poor (e.g. ghost images) during thunderstorms, that can be attributed to EMI. If your cordless phone or WiFi (wireless network) does not work when a nearby microwave oven is running, that can also be attributed to EMI.

EMI from Sources External to the Automobile

There has been speculation that EMI from sources such as car washes and big restaurant ovens affects automotive electronics. The intensity of these signals drops (at least) as the inverse square of the distance from the source. In other words, the strength of the interference decays very rapidly, so it is unlikely to be the source of sudden and sustained acceleration where the vehicle has traveled several hundred meters past the source.
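
To put rough numbers on the distance argument, here is a minimal sketch of the inverse-square fall-off. It assumes an idealized point source radiating into free space; the function name and the 1 m reference distance are illustrative choices, not measurements of any real source.

```python
def relative_intensity(distance_m: float, reference_m: float = 1.0) -> float:
    """Intensity at distance_m relative to the intensity at reference_m,
    assuming an idealized point source in free space (inverse-square law)."""
    if distance_m <= 0 or reference_m <= 0:
        raise ValueError("distances must be positive")
    return (reference_m / distance_m) ** 2


if __name__ == "__main__":
    # Compare a few distances against the intensity 1 m from the source.
    for d in (1, 10, 100, 300):
        print(f"{d:>4} m: {relative_intensity(d):.6f} of the 1 m intensity")
```

Under this idealized model, the intensity a few hundred meters away is a tiny fraction of the intensity right next to the source, which is why a lingering effect from an external source seems unlikely.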

EMI from Sources Internal to the Automobile

Interference from within the automobile can in principle cause problems as well. Ford recently noticed EMI between two neighboring wires causing problems in the Ford Fusion Hybrid braking system. On seeing such EMI, the software erroneously decided to transfer control from the regenerative brake system to the (traditional) hydraulic brake system. Additional shielding of the cables and a software patch fix the problem.

Can such EMI cause Toyota vehicles to experience SSUA? In principle, yes. But this would require two conditions to hold true: the EMI distorts inputs to the electronics, and the software interprets those distorted values incorrectly.

My personal opinion is that if this were the problem, it is easier to detect than other sources. If software just reacts instantaneously to the fluctuating EMI signals, sustained acceleration will perhaps not happen.

Electro-Static Discharge (ESD) Issues

Electrostatic discharge (ESD) is the name given to the sudden and short-lived electric current that flows between two objects at different electrical potentials (voltages) caused by direct contact or induced by an electrostatic field. These currents, while short-lived, are unwanted and may cause damage to electronic equipment.

The simplest ESD example that people see in practice is the very brief spark that happens in winter when you touch a metal object, get a ‘shock’ and see a spark. (Charge develops on your body as you walk across a carpet, for example, and it gets discharged when you touch a conducting material). At home, one often uses voltage surge suppressors to connect sensitive electronics like TVs and computers. Without such surge suppressors, voltage spikes like lightning can enter your electronics and cause permanent damage. Lightning is another classic example of ESD.

Due to the damage that ESD can cause to electronics, there are military, industry, automotive and international standards to deal with it. Popular consumer electronics like camcorders, mobile phones, and digital cameras have built-in voltage shunts that keep these spikes from reaching the core of the electronics and damaging them.

ESD damage to electronics, which worsens over time, can fall into different categories. One, the damage can be permanent and the device fails. This is often referred to as a hard fault. Two, the damage seems to reset itself and function correctly (for a while) when the device is shut down and restarted. This is often referred to as a soft fault. Some standards even define finer distinctions.

Hardware Issues

Can any problems in Toyota throttle electronics lie in hardware?  Here are the possibilities:

  1. There are no logical errors in the hardware, particularly when all goes according to plan (such as no EMI and no sensor or Electronic Control Unit failures). It is very likely that this situation reflects the vast majority of cases. In practice, Toyota could not have kept shipping the ETCS-i since around 2001 if the design were behaving in a faulty fashion across most of its models.
  2. Hardware component failures, when they happen, lead to unexpected outputs (causing SSUA). Safety-critical systems such as electronic throttle control systems are supposed to have built-in fail-safe mechanisms. A designer must make assumptions about what could fail and how the system will react to it. See my earlier posting on “Brake Overrides: The Devil in the Details” for additional details. The lack of brake overrides in the ETCS-I is an issue that Toyota will have to deal with for quite some time. Its promise to add brake overrides to recent models is a good step in the right direction but leaves open the question “What about the older models?”. Another question that applies to all models (even those with the promised overrides) is the lack of a (say, mechanical) override mechanism that does not depend on the ECU (Electronic Control Unit) inside the ETCS-I to process and execute the override requirement.
  3. Sensor failures (throttle position sensors, for example) can also cause ETCS-I to believe that all is well with the engine control when it is not.  If the throttle position were fully open in reality, but the throttle position sensor reports that it was nearly or fully closed, the engine could experience a surge in acceleration as the throttle is commanded by the ECU to open more and more.   Similarly, if the accelerator pedal position sensor (APPS) were malfunctioning and it reports that the gas pedal was pressed down when it actually is not, SSUA would be a natural result.
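
One common defense against the sensor failures in item 3 is to cross-check redundant sensors and fail safe when they disagree. The sketch below is only an illustration of that idea: the two-sensor arrangement, the 0.0 to 1.0 position scale, the 5% tolerance and the function names are all assumptions of mine, not a description of what the ETCS-I actually does.

```python
from typing import Optional


def plausible_position(sensor_a: float, sensor_b: float,
                       tolerance: float = 0.05) -> Optional[float]:
    """Return the averaged position (0.0 = closed, 1.0 = fully open) if two
    redundant sensors agree within tolerance, else None to signal a fault."""
    for reading in (sensor_a, sensor_b):
        if not 0.0 <= reading <= 1.0:
            return None  # out-of-range reading: treat as a sensor fault
    if abs(sensor_a - sensor_b) > tolerance:
        return None  # sensors disagree: neither reading can be trusted
    return (sensor_a + sensor_b) / 2.0


def throttle_command(requested: float, position: Optional[float]) -> float:
    """Fail safe: command a closed throttle whenever the position is unknown."""
    return 0.0 if position is None else requested
```

Note that a check like this catches one sensor drifting away from the other, but not a common-mode fault that corrupts both readings in the same way.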

Software Issues

Could software in the ETCS-I have problems?

It would be very hard to prove that there are no software problems; such verification technology for the complex situations that ETCS-I can encounter (including noise and failures) is not very mature yet. Those outside Toyota can only conjecture. Only those with access to the source code of the programs running on the ETCS-I can make more precise statements.

One or two lines of code can in principle do the wrong thing under a complex set of conditions that happen in the accelerator pedal position and throttle position sensors, combined with internal context.

Summary

In principle, many things can go wrong in Toyota’s ETCS-I due to many factors including electro-static discharge (ESD), electro-magnetic interference (EMI), hardware or software. We are currently focusing on ESD as the likely source of soft and hard faults in the electronics that can lead to SSUA (sudden and sustained unintended acceleration). Any one of these elements, or a combination of them, could be causing the problems.

Brake Overrides: The Devil in the Details

February 19, 2010

Why Toyota does not install brake overrides of the throttle control system on all their recalled cars and future models is a topic of considerable discussion. Toyota has indeed announced that they will have such ‘smart brakes’, which prioritize braking actions over throttle actions, in future models. In addition, some past models will also be upgraded with a software patch.

Let’s look at some details of what this means.  In a fully throttle-by-wire system like Toyota’s ETCS-i, the override mechanism can be made completely electronic as well.  If the brake pedal is pressed, that information can be communicated to the throttle control system (using a message over a wire/communication bus), which in turn can close the throttle in response, and independent of the position of the gas pedal.   Cost: a software update.  But, this solution makes several assumptions:

  1. There exists a communication medium between the braking system and the throttle control system.   This may or may not be true.
  2. There exists enough bandwidth (or enough message slots) in this communication medium to carry the override message without disrupting other messages.  Probably true, if the medium by itself is present.
  3. The ECU (Electronic Control Unit, i.e. computer) and software are all functioning correctly so that the brake override message is both received and processed correctly.  If the root of the problem is that the ECU and/or software has failed in some way, the override will not work.  However, an electronic override would be better than having no override at all.
  4. The throttle mechanism continues to work correctly under electronic control.  This aspect too would depend on whether the root of the problem lies at the interface between the ECU and the throttle control.
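
The electronic override described above boils down to a very small piece of decision logic. Here is a minimal sketch, assuming a normalized throttle opening between 0.0 (closed) and 1.0 (fully open); the function and parameter names are hypothetical and do not come from any Toyota software.

```python
def throttle_target(pedal_request: float, brake_pressed: bool) -> float:
    """Throttle opening to command (0.0 = closed, 1.0 = fully open).

    The brake always wins, independent of the gas pedal position.
    """
    if brake_pressed:
        return 0.0  # brake override: close the throttle
    # No brake input: follow the pedal, clamped to the valid range.
    return min(max(pedal_request, 0.0), 1.0)
```

The logic itself is trivial, which is exactly why the four assumptions matter: this code runs on the same ECU and relies on the same messages, so it helps only if those are still working when the override is needed.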

An override mechanism that is better and stronger than the electronic mechanism would be mechanical in nature.  In this solution, when the brake pedal is pressed, it mechanically (or electro-mechanically) pushes the throttle to close.    The link from the brake pedal to this mechanical override could be mechanical or electronic (but must be completely independent of the throttle control electronics).

Yes, this superior fail-safe alternative would be costlier. Nevertheless, such independent fail-safe mechanisms should be deemed necessary in all future vehicles particularly for the safety-critical subsystems for acceleration, braking, steering and transmission control.   Using laws of physics (such as gravity or Maxwell’s Laws for Electro-magnetism) so that the fail-safe mechanisms will be guaranteed to kick in is a ripe area of innovation as electronics and software take on more and more functionality in automobiles.

Electronics Failures

February 19, 2010

Here are two excerpts of vehicle speed problems from two complaints about 2009 Toyota Camrys:

She proceeded through a traffic light at approximately 5 MPH, the vehicle accelerated to a higher speed. [NHTSA ODI ID 10306379].

The [driver] was driving at 60mph when his vehicle jerked and accelerated at a high speed. [NHTSA ODI ID 10304646].

Neither of the above complaints seems to indicate that the pedal was stuck in a floor mat, or that the pedals were sticky and not coming back. In fact, there are 61 complaints regarding the 2009 Toyota Camry alone in the NHTSA ODI database in the category of vehicle speed control. There are 30 complaints in this category for the 2009 Toyota Corolla.

In comparison, the 2009 Chevy Impala, Chevy Malibu and Ford Fusion have 0 (zero) complaints in this category! The Nissan Altima and Altima Hybrid have 6 complaints, and the Honda Accord has 3 complaints. The obvious question is whether these numbers are unbalanced because the Camry outsells them all. In the first half of 2009 (the period for which numbers were available), the Toyota Camry with 150,242 US units did outsell the rest, but not by much over the Honda Accord (118,459), the Nissan Altima (96,428) or the Ford Fusion (85,146). The 2009 Toyota Corolla sold 121,643 units in the US during the first half of 2009. In other words, the Accord sold nearly as many units as the Corolla and had far fewer vehicle speed control problems. (Please look for a future post where disproportionate numbers for Toyota can be seen on other model-year vehicles as well.)
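
To compare the models on an equal footing, one can normalize the complaint counts by units sold. The sketch below uses only the figures quoted above. It is a rough comparison: the complaints are for 2009 model-year vehicles while the sales are for the first half of calendar 2009, so the two do not line up exactly, and the dictionary and function names are mine.

```python
# Vehicle speed control complaints (NHTSA ODI) quoted in this post.
complaints = {
    "Toyota Camry": 61,
    "Toyota Corolla": 30,
    "Honda Accord": 3,
    "Nissan Altima": 6,
    "Ford Fusion": 0,
}

# US units sold in the first half of 2009, as quoted in this post.
units_sold = {
    "Toyota Camry": 150_242,
    "Toyota Corolla": 121_643,
    "Honda Accord": 118_459,
    "Nissan Altima": 96_428,
    "Ford Fusion": 85_146,
}


def complaints_per_100k(model: str) -> float:
    """Complaints per 100,000 units sold for the given model."""
    return complaints[model] / units_sold[model] * 100_000


if __name__ == "__main__":
    for model in sorted(complaints, key=complaints_per_100k, reverse=True):
        print(f"{model:<15} {complaints_per_100k(model):6.1f} per 100,000 units")
```

Even after normalizing, the Camry and Corolla rates come out several times higher than those of the Altima and the Accord.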

While floor mats have been an issue and sticky gas pedals do seem to happen, it clearly looks like there are sudden acceleration issues going beyond either of these causes on these Toyota cars (all of which share the ETCS-i).   Unfortunately for Toyota, their most recent recalls will not be the final word on throttle control problems – the electronics issue needs to be resolved.

Competitiveness in the Automotive Market: A New Future!

February 17, 2010

Toyota’s manufacturing prowess is legendary in the automotive industry and even beyond. Their philosophy of continuous improvement (called “Kaizen”) has been adopted (or at least attempted or considered for adoption) by many large-scale manufacturers. Specifics like the empowerment of any factory employee to pull a cord (called the “Andon”) and stop an entire assembly line when a defect is noticed are also emulated by many.

Two global trends are likely to dramatically alter this competitive edge that Toyota has enjoyed for some decades now.

  1. First, by emulating Toyota as best as they can, the Detroit Three and South Korean carmakers have been able to narrow the quality gap over the years.
  2. Second, a modern automobile is increasingly a networked computer platform on wheels. By 2015, roughly a third of an automobile’s value will be in its electronics. As some of my earlier posts emphasize, electronics, despite their potential occasional glitches, are what make modern cars better in terms of fuel efficiency, safety, emissions and other functions. We are not going back to the past. But electronics and software design is not just a manufacturing issue; an andon cord in the assembly line will not detect an electronic design problem or a software bug. The decisions that get made months and years ahead of the assembly process are what make the difference. As of now, no car company including Toyota has a major lead on this underlying technology (of building the so-called cyber-physical systems that automobiles increasingly are). Toyota does not have a technology lead on software development for these systems either. Any advances from a Toyota supplier are also easily accessible to other carmakers because of a global supply chain in today’s interconnected economy.

In other words, the future of automotive competitiveness is wide open from a quality/technology perspective.   Anybody can get ahead in this race.   Sound and strategic investments to both develop new electronics/cyber-physical systems technologies and train engineers/developers on these technologies are what will make a difference.

Let the race begin, and let the best carmaker(s) win.

Your Automotive Black Box and Your Right to Access It

February 17, 2010

Yesterday’s Wall Street Journal has an article on black boxes on cars from the Japanese carmakers and the Detroit Three. The “black boxes” store key vehicle attributes such as brake position, throttle position, speed and acceleration for the past few seconds of automobile operation. Toyota’s black boxes store 2 seconds of limited information and, in addition, the data in those boxes can only be retrieved and interpreted by Toyota. In contrast, the data from the black boxes on the Detroit Three can be accessed by the consumer. Very interesting! If only the Toyotas stored data for a few minutes of operation, and the data were accessible directly by consumers, the ongoing debate regarding whether sudden and sustained unintended acceleration problems were due to Toyota’s electronic throttle control systems would be settled very quickly one way or the other.

One good outcome of next week’s congressional hearings could be the following: DoT regulations requiring automotive black boxes to store data for substantially longer than 3 seconds could be made to take effect soon (rather than in 2013). I would say that the black boxes should store at least the last 15 minutes of operation; memory is cheap, particularly when one considers the safety implications. One must note that such black box data actually helps carmakers in exonerating their systems in case of driver error.
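
To back up the “memory is cheap” claim, here is a minimal sketch of a black box as a ring buffer that always holds the most recent 15 minutes of data. The sampling rate (10 samples per second), the sample size (16 bytes for four values) and the class and function names are assumptions made for illustration; real event data recorders will differ.

```python
from collections import deque

SAMPLE_HZ = 10          # assumed sampling rate, samples per second
RETENTION_S = 15 * 60   # keep the last 15 minutes
BYTES_PER_SAMPLE = 16   # assumed: four 4-byte values per sample


class BlackBox:
    """Ring buffer that keeps only the most recent samples."""

    def __init__(self, sample_hz: int = SAMPLE_HZ,
                 retention_s: int = RETENTION_S) -> None:
        # A deque with maxlen silently drops the oldest sample when full.
        self._samples = deque(maxlen=sample_hz * retention_s)

    def record(self, brake: float, throttle: float,
               speed: float, acceleration: float) -> None:
        self._samples.append((brake, throttle, speed, acceleration))

    def snapshot(self) -> list:
        """Return the retained samples, oldest first."""
        return list(self._samples)


def storage_bytes(sample_hz: int = SAMPLE_HZ, retention_s: int = RETENTION_S,
                  bytes_per_sample: int = BYTES_PER_SAMPLE) -> int:
    """Memory needed to hold the full retention window."""
    return sample_hz * retention_s * bytes_per_sample
```

Under these assumptions, 15 minutes of data needs 10 × 900 × 16 = 144,000 bytes, i.e. well under a megabyte.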

Add the right to access data in your car’s black box reader to the list of consumer rights…

The Safety Engineer’s Dilemma

February 16, 2010

The recent recalls of millions of Toyota vehicles are shining, in part, a spotlight on the safety features of modern electronics systems. The designer(s) of a safety-critical system can certainly “over-design” the system with several fail-safe mechanisms and excessive amounts of redundancy that are also accompanied by exhaustive/extended testing. This could however be self-defeating in that the resulting system can become

  1. Too complex, with its own inherent failure modes and an exponentially growing number of tests that need to be carried out.
  2. Too expensive and therefore unaffordable and impractical.
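
The exponential growth in testing can be seen with a simple count. The sketch below assumes the crudest possible model, in which each mechanism is independently either working or failed and exhaustive testing must cover every combination; the function names are mine.

```python
from itertools import product


def fault_combinations(mechanisms: int) -> int:
    """Number of working/failed combinations for independent mechanisms."""
    if mechanisms < 0:
        raise ValueError("mechanisms must be non-negative")
    return 2 ** mechanisms


def enumerate_states(mechanisms: int):
    """Yield every combination as a tuple of booleans (True = working)."""
    return product((True, False), repeat=mechanisms)


if __name__ == "__main__":
    for n in (3, 10, 20, 30):
        print(f"{n:>2} mechanisms: {fault_combinations(n):,} combinations")
```

Every added mechanism doubles the count, so ten mechanisms already give 1,024 combinations and thirty give more than a billion, before even considering timing or partial failures.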

Conversely, if one puts in too few fail-safe mechanisms and too little redundancy (which will probably be only known in hindsight), the system could become vulnerable to non-negligible non-zero failure rates.  What is a safety system engineer to do?  This is the dilemma that a safety engineer faces.

I see two ways out:

  1. Imagine a nuclear power plant – the system must necessarily be “over-engineered” due to the catastrophic implications of system failure.  However, fortunately, it would appear that the benefits of cheaper energy make the end-system affordable, acceptable and practical (in at least some countries).   Storage of spent fuel seems to be a pending issue, and again over-engineering of storage containers and regions would appear to be the solution.    In other words, these are systems whose benefits are large compared to the costs of such safety-critical systems.
  2. In the automotive context, the end-system (the automobile) must still be affordable, which of course is a relative term. The more advanced systems with at least a touch of over-engineering get into high-end models first, and then slowly migrate to the low-end models. Extensive data collection and conscious tracking of various tradeoffs made during design ought to be an integral part of the process.

Remember that the electronics and its associated software are what give us the power, sophistication, features and flexibility we want. Also, one must note that mechanical elements have their own failure rates as well (think worn-out brake pads, broken or cracked metal shafts and even battery failure).

Toyota Revisiting Electronics Throttle Control

February 16, 2010

The Wall Street Journal reports that a preliminary study carried out by Exponent showed that “Exponent had been unable to cause sudden acceleration by making electrical disturbances to Toyota vehicles’ electronics systems”. It is not clear from this report whether the disturbances were generated from external sources or not. I tend to think that external disturbances are unlikely to cause a throttle to stick and stay stuck so as to create sudden and sustained unintended acceleration (SSUA). (The Ford situation described in a previous blog posting relates to internal disturbances to which the Ford braking system reacts by switching from regenerative braking to hydraulic braking, a software glitch that Ford has promised to fix).

According to an AP article, a spokesperson said that Toyota “does not think there are any electronics problems with its vehicles, but promised to look into it again”.    Another article from Reuters notes that, according to the Yomiuri paper from Japan, “Toyota … aims to demonstrate that there are no problems with the systems with help from external experts”.

I like the fact that this quote uses the word “demonstrate” and not “prove”.   How would one “prove” that such systems have no problems?      One must point out that designers of complex systems always aim to bring failure rates to an acceptably low level and not to zero.   While some members of the public (or the media) may react with panic at the notion of non-zero failure rates where lives may be at stake, that is how all systems in practice behave.  One can imagine a scenario where basic assumptions do not hold.  As long as the failure rate is below a very low threshold, the benefits can outweigh the risks.  One also puts in multiple levels of backups, so that damage if it occurs is dramatically reduced.
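
The effect of multiple levels of backups can be illustrated with a small calculation. The sketch below assumes that the layers fail independently of one another, which is a strong assumption (a common cause such as a power failure can take out several layers at once); the per-layer probabilities are made-up numbers and the function name is mine.

```python
import math
from typing import Sequence


def combined_failure_probability(layer_probabilities: Sequence[float]) -> float:
    """Probability that every layer fails, assuming independent failures."""
    if not layer_probabilities:
        raise ValueError("at least one layer is required")
    for p in layer_probabilities:
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie between 0 and 1")
    return math.prod(layer_probabilities)


if __name__ == "__main__":
    # Hypothetical numbers: a primary system and two backups.
    layers = [1e-3, 1e-2, 1e-2]
    print(f"{combined_failure_probability(layers):.1e}")
```

The result is never zero, but each independent backup multiplies the overall failure probability down, which is the sense in which failure rates are brought to an acceptably low level.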

Is faith in automotive electronics and software misplaced?

February 10, 2010

Given the high-profile coverage of Toyota’s recent recalls, we are beginning to see increased focus on the safety of electronic systems in cars.  For example, see the article “Your Car Computer Can Kill You”.  Indeed.  Are we putting too much faith into electronics and software that control our cars?

Let’s take a step back and look at the bigger picture.

Fly-by-wire systems transformed the aviation industry, so much so that trans-continental and inter-continental flights are flown for the most part by an autopilot. Engines, ailerons, landing gears, and other equipment which used to be mechanically coupled are all controlled “by-wire”. (Mechanical linkages are replaced by wires that carry signals from pilot interfaces to the actuators that control the plane). Four incidents are noteworthy and illustrative.

  1. The first was on August 1, 2005, when a Malaysia Airlines 777 over the Indian Ocean thought that it was both flying too fast and too slow! The autopilot pitched the plane’s nose upwards, which risked stalling the plane. Then, when the pilots brought it back, it started pitching upwards again. Eventually, the pilots regained control and the software had to be fixed. An accelerometer sensor reading was suspected to have played a role.
  2. In January 2008, a Boeing 777 crash-landed at Heathrow Airport when both its engines seemed to fail. Subsequent investigations indicated that there had been ice in the fuel causing both engines to lose power.
  3. In January 2009, a flock of birds hit an Airbus A320 taking off from LaGuardia airport in New York, and Captain Chesley Sullenberger was widely applauded for his masterful landing of the plane on the Hudson river. Nobody was killed.
  4. In June 2009, an Airbus A330 disappeared over the Atlantic in a thunderstorm killing 228 people. Short circuits and an incorrect speed sensor were both suspected but no confirmation could be obtained since the wreckage could not be located.

Until the first two incidents above, Boeing had put more than 600 of its 777s into service and not one had crashed. A remarkable track record indeed. Sensor malfunctions and ice in the fuel are not electronics problems per se; they would have affected mechanical systems equally or even worse. In the “Hudson Miracle”, in addition to the captain’s equanimity and superb execution under pressure, the fly-by-wire system also deserves a good amount of credit for keeping the plane controllable. In the last incident, incorrect sensor readings would be devastating to a human pilot in control as well.

Fly-by-wire systems are what made intercontinental flights what they are today: comfortable, fuel-efficient, and affordable. We cannot and will not go back to mechanical systems. And fly-by-wire systems have indeed stood the test of time. Competition between Boeing and Airbus at the mega-plane level, and among many other players in the smaller jet segment, continues to be intense. Quality and reliability will only go up (sometimes in stutter steps, sometimes two steps forwards and one step backwards. And yes, sometimes one step forwards and two steps backwards as we understand operating limits, working assumptions, and system deficiencies better).
