O/T - MEMS3 Td5 Emulation – Emulating a Td5 ECU on Petrol ECU Hardware

revilla

A bit off topic this one, but it builds directly on all the stuff I've done on Petrol MEMS3 and I thought some people might find it interesting ...

Download Link: http://andrewrevill.co.uk/Downloads/MEMS3Tools.zip

Motivation

Ever since I released the first version of MEMS3 Mapper, I’ve had more requests for help from Land Rover Discovery and Defender Td5 owners than from Rover K Series owners. This came as a bit of a surprise to me at first as when I was writing it, I had no idea that MEMS3 was also used on the Td5 diesel engine. The requests I get tend to be a bit different to those for Rover K Series ECUs. Farmers in South Africa are more interested in disabling immobilisers that could strand them miles from anywhere in rough country than they are in tuning, but a lot of diesel owners are also adding high pressure turbochargers etc.

It turns out that the ECUs are different, but very close cousins, and most of the basic functionality of MEMS3 Mapper did indeed just work with the Td5 ECU. Since then, though, I’ve developed a lot of features that go beyond standard communications with the ECU. Some of these worked on Td5 ECUs, some did not, some were just not relevant, and some commands had a different meaning so were actually dangerous to run on a Td5. My approach with these until now has been to disable them on anything other than petrol ECUs. Getting them working would have meant experimenting on real Td5 hardware, and whilst petrol MEMS3 ECUs can be had for as little as £5 on eBay, Td5 ECUs are usually in the £250-£500 bracket. I knew I would be quite likely to get through a few of them before I was done, which just didn’t seem justifiable when I wasn’t going to get any benefit out of it for myself. There was also the issue that I didn’t have an engine I could run anything on, but for most of the features this wasn’t a major issue; so long as I had an ECU up and running on the bench I would have all of the communications and programming side working to play with.

Recently I’ve had more and more enquiries about using the newer features with Td5s, so I decided to see what I could do about it.

All I needed was a way of running a Td5 boot loader, firmware and map on any system that would support it, with a compatible communication port to talk to.

Updating Boot Loaders

One of the more recent developments was the ability to replace the boot loader on a running ECU, without having to dismantle it and remove the main ROM chip. I’ve actually now got two methods to do this. The first method erases the firmware, writes the new boot loader into the unused firmware space and then uploads a program into RAM and executes it; the RAM program erases the boot loader and copies in the new one, before wiping the firmware area again, leaving the ECU clean with the new boot loader, ready for normal programming. This is the quickest way to update the boot loader (the boot loader itself is relatively small), the safest way (the boot loader can be verified in memory before installing it) and allows the user to be prompted not to power the ECU off at the critical moment. However, it does rely on executing code in RAM, which was one of the advanced features that was disabled on Td5 ECUs for now and would take some research on a running ECU to get working, which left me in a bit of a chicken-and-egg situation. The second method consists of a “normal” firmware and map file; but instead of actual firmware, it contains a copy of a new boot loader and the update program dressed up to boot in the way a firmware would, so after writing, when the ECU tries to boot up on the new “firmware” it actually executes the boot loader update program which then proceeds as before. As far as the ECU is concerned, this is just a regular programming job, so doesn’t include any special protections, but is easy to implement.

My first idea was to just flash a Td5 boot loader onto a petrol ECU and see what happened. I tried this; the boot loader crashed, the ECU would not communicate again and went in the bin!

I double-checked everything, knocked out a few things which I thought might be troublesome, and tried again. And bricked another ECU. I realised this was going to get very tiresome and expensive, so came up with a Plan B.

Plan B

One feature of MEMS3 Mapper is the ability to recover an ECU where the firmware has been bricked (note that this still requires the ECU to be able to boot into the boot loader to be of any use, so was no help in recovering the above ECUs). It does this by repeatedly sending the byte $96 over the OBDII port as the ECU boots and listening for a $00 reply. I found some escape code in the early start-up of the boot loader which checks for $96 in the serial communications data register as it boots, and doesn’t try to launch the firmware if it finds it. This happens fairly early on in the start-up code; so long as the ECU survives basic hardware register setup, it will reach that check.
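
The recovery handshake described above can be sketched in a few lines of Python. This is only an illustration, not the code MEMS3 Mapper uses: the `FakePort` class and the `wait_for_boot_loader` function are made-up names, the retry count is arbitrary, and a real tool would be talking to a serial adapter on the OBDII port with timing that the post does not specify.

```python
import time

RECOVERY_BYTE = 0x96  # byte the boot loader checks for at start-up (from the post)
ACK_BYTE = 0x00       # reply indicating the boot loader has stayed resident


class FakePort:
    """Hypothetical stand-in for a serial port; the 'ECU' wakes after a few writes."""

    def __init__(self, writes_before_boot=5):
        self.writes_before_boot = writes_before_boot
        self.writes = 0

    def write(self, data: bytes) -> None:
        self.writes += len(data)

    def read(self) -> bytes:
        # Answer only once the simulated ECU has booted far enough to see $96.
        if self.writes >= self.writes_before_boot:
            return bytes([ACK_BYTE])
        return b""


def wait_for_boot_loader(port, attempts=1000, delay_s=0.0) -> bool:
    """Keep sending $96 until the ECU answers $00 or the attempts run out."""
    for _ in range(attempts):
        port.write(bytes([RECOVERY_BYTE]))
        if port.read() == bytes([ACK_BYTE]):
            return True
        time.sleep(delay_s)
    return False
```

Because the byte is sent continuously, it does not matter exactly when the ECU powers up; whenever the boot loader reaches its check, a $96 should be waiting in the receive register.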

So Plan B was as follows. Wipe the firmware. Upload a copy of a good PETROL boot loader high up in the firmware area at $120000 (where the ECU wouldn’t be looking for a firmware, so it would behave as though it was factory wiped and ready for programming). Upload a short program, also high up in the firmware area out of the way, that could copy this petrol boot loader into the boot sector. Then modify a copy of a DIESEL boot loader so that if it detected the $96 recovery signal, instead of continuing to run the boot loader it would just jump directly to the update program in the firmware area.

The idea was that even if the diesel boot loader didn’t get to the point where it could communicate normally, so long as it started the boot sequence I could always intercept it and force it to revert to petrol mode.

I tested this idea with a different petrol boot loader first, just so that I would be able to recover things if it went wrong, but it worked. So I tried it with a diesel boot loader and it did what I hoped. The ECU was initially bricked and the boot loader just crashed as before, but if I used the ECU recovery tool and sent it the $96 signal as it booted, it immediately reinstalled the petrol boot loader and booted back up again as a happy petrol ECU.

So now I had a viable way of testing diesel boot loaders without losing ECUs every time.

The boot loader code has a fairly simple high-level structure. It starts off setting a load of module control registers to initialise the basic hardware, then it calls a list of subroutines to do more setup jobs, then it goes into a loop which basically consists of calling another list of subroutines one after another, looping back to the start and repeating when it gets to the end, forever. Each of these subroutines handles a different area of the ECU functionality. For my purposes, anything that was actually engine management related, which was operating the engine control hardware, was unnecessary (it was never going to be able to run either a petrol or a diesel engine, being a hybrid). I reasoned that the most likely reason for it crashing would be that some of the subroutines, somewhere deep down in the nested tree of calls, would be accessing non-existent hardware. So I started knocking out subroutine calls just to see what would happen. Initially I knocked out all of the calls in both the setup and loop code, and the ECU then booted up and sat quietly without crashing, but obviously it wouldn’t do anything. It was just going round the almost empty loop, resetting the watchdog timer but doing nothing useful; it wouldn’t even communicate as some of the subroutines I had knocked out would have been responsible for managing the communications side of things, but I had the escape route set up so each time I could force it back into petrol mode ready for the next test. One by one I added the top level subroutines back in to see what worked and what didn’t. As I added the calls back in, bit by bit the ECU came back to life. In the end it turned out that there was just ONE subroutine in the setup code and ONE in the loop that were causing the ECU to crash.
I have no idea what they were actually doing, but knocking them out left me with an ECU which behaved normally from a communications / programming / diagnostic point of view, so I assumed they were engine management hardware stuff and just left them knocked out.
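
For anyone wondering what “knocking out” a call might look like in practice, here is a rough Python sketch. It assumes the calls are encoded as absolute long JSR instructions (opcode word $4EB9 followed by a 32-bit address) and replaces them with NOPs ($4E71); the post does not say how the calls are actually encoded or how they were disabled, so treat the function names and the approach as illustrative only.

```python
NOP = bytes.fromhex("4E71")           # 68000 NOP opcode word
JSR_ABS_LONG = bytes.fromhex("4EB9")  # JSR (xxx).L: opcode word + 32-bit address


def find_calls(image: bytes, start: int, end: int) -> list:
    """Walk a run of back-to-back JSR (xxx).L instructions, returning their offsets.

    Assumes [start, end) is a simple call list, stepping 6 bytes per JSR and
    2 bytes over anything else.
    """
    offsets = []
    pos = start
    while pos + 6 <= end:
        if image[pos:pos + 2] == JSR_ABS_LONG:
            offsets.append(pos)
            pos += 6
        else:
            pos += 2
    return offsets


def knock_out_call(image: bytearray, offset: int) -> None:
    """Replace the 6-byte JSR (xxx).L at `offset` with three NOPs."""
    if image[offset:offset + 2] != JSR_ABS_LONG:
        raise ValueError(f"no JSR (xxx).L at offset {offset:#x}")
    image[offset:offset + 6] = NOP * 3
```

Starting with every call knocked out and restoring them one at a time, as described above, then only needs a copy of the original image to put each 6-byte call back.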

So now I had a petrol ECU that would boot up on a slightly modified diesel boot loader and would then communicate with me, identifying itself as a diesel ECU and behaving from a communications perspective as a Td5.

Loading Firmware

So far, so good. So now time to write a diesel firmware and map to it to see what happened. Predictably, it crashed again. Well it sort-of crashed, it actually just froze. It didn’t reboot, it just hung up doing nothing. And this wasn’t when running the diesel firmware, it hadn’t got that far. It was when using the diesel boot loader to write the firmware to the ROM. The first block of bytes I sent to it was accepted and I got a positive response message, the second block didn’t get a response and the ECU was completely hung up. Hmmm …

That’s actually quite an odd response, it’s something the ECU is very carefully set up to avoid. The last thing you want is your ECU to hang up while driving as the engine would just shut down. There’s a watchdog timer which the ECU has to continually reset during normal operation. If the code stops resetting the watchdog, the watchdog reboots the ECU at a hardware level to allow it to continue. And that wasn’t happening. So the ECU must be stuck in a loop where it was EXPECTING to spend some time, and was therefore resetting the watchdog timer in the loop.

I looked at the code used to write bytes to the ROM, which is actually a fairly time-consuming process. It worked like this:

1.      It performed a setup sequence of “writes” on the ROM which acted as a signal to tell it that it was about to write data.

2.      It wrote the data.

3.      The ROM spent some time writing the data. While it was busy writing, any reads from the ROM from any address would actually return a status register instead. The high order bit D7 of the status register contained the INVERSE of the data written, so one way of working out when it was finished was to keep reading and looking at D7. If it was the opposite of what you wrote, it was still busy and you were reading the status register. If it contained the bit you wrote, the write was complete. Bit D6 gave another way of telling when the write was finished. That toggled between 1 and 0 on each successive read while it was busy, so if two reads in a row gave the same value, the write was finished.

4.      The ECU code used the D7 method. It waited until it saw the value written. But this is the less safe method of the two, in that if for some reason the write didn’t actually start, it would immediately start reading the data in the ROM at the programmed address, which would be $FF as the ROM would previously have been erased. So the high bit would be a 1. If the high bit of the byte written was a 1, it would assume the write was finished. If the high bit of the byte written was a 0 though, it would assume that the write was still in progress and it was seeing the status register, and would wait forever, resetting the watchdog timer.

My guess was that this was what was happening, so I thought of a simple way to test it. If I wrote a modified file where the high bit of every byte was set to 1, it should get through the write without hanging up. It did. If any one byte had a high bit of 0, it should freeze at the point of trying to write that byte. It did. So that was fairly conclusive.
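
The failure mode above is easy to reproduce in a small Python simulation. This is a sketch under stated assumptions, not the ECU code: `erased_rom_read` models a write cycle that never started (every read returns the erased value $FF), and the `max_reads` limit exists only so the simulation terminates; the real loop as described has no such limit and just keeps resetting the watchdog.

```python
def d7_poll(read_byte, written: int, max_reads: int = 1000) -> bool:
    """D7 polling: the write is taken as complete once D7 matches the byte written.

    Returns False if D7 never matches within max_reads (i.e. the real code hangs).
    """
    for _ in range(max_reads):
        if ((read_byte() ^ written) & 0x80) == 0:
            return True
    return False


def d6_toggle_poll(read_byte, max_reads: int = 1000) -> bool:
    """D6 polling: the write is taken as complete once two successive reads agree on D6."""
    previous = read_byte()
    for _ in range(max_reads):
        current = read_byte()
        if ((previous ^ current) & 0x40) == 0:
            return True
        previous = current
    return False


def erased_rom_read() -> int:
    """Models a write that never started: reads return the erased cell, $FF."""
    return 0xFF
```

With this model, `d7_poll(erased_rom_read, 0x80)` returns immediately while `d7_poll(erased_rom_read, 0x7F)` never completes, matching the test with the modified file. The D6 method would not hang in the same situation, although a read-back comparison would still be needed to notice that nothing had actually been written.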

So, for some reason the ROM was not accepting the write sequences correctly, and was not actually starting a write cycle.

I set about reading the datasheets for the ROM and CPU chips and looking at the code and worked out:

1.      The ROM can be configured to behave as 256 K of 8-bit bytes or 128 K of 16-bit words. There’s one pin that configures the ROM in word or byte mode. It is normally read in word mode as the CPU bus is 16 bits wide, but the programming sequences used matched those for byte mode, so the ROM was being accessed quite differently for reading and writing. Reading in word mode, writing in byte mode.

2.      This meant that different chip select electronics were being used to address the ROM in normal read mode and in programming write mode. The chip select signals for the ROM were being controlled by MCU registers CSBARBT (Chip Select Base Address Register – Boot ROM) and CSORBT (Chip Select Options Register – Boot ROM). A bit of poking around the circuit board with the continuity tester on a multimeter found that the pin that controlled the word or byte orientation for the ROM was connected to pin CS2 on the microcontroller (Chip Select 2), meaning that registers CSBAR2 (Chip Select Base Address Register 2) and CSOR2 (Chip Select Options Register 2) came into play, and sure enough I had previously seen that before writing to the ROM the code always changed CSOR2 from the default value of $1031 set by the boot loader to $7031, and reset it when finished. Furthermore I had previously noted that when writing to the ROM, the code always copied a subroutine into RAM and executed it there. I had never quite understood why; it would make sense when overwriting the area of ROM where the code itself lived, but that wasn’t happening here as it was always boot loader code overwriting firmware or map areas. Now it made sense. Changing CSOR2 from $1031 to $7031 puts the ROM into byte-oriented mode for programming, and in this mode it can’t properly execute ROM code, which requires the CPU to access it by word, so it copied the code it needed to execute to RAM, put the ROM into byte mode, programmed it, and put it back into word mode before returning. Clever!

3.      Presumably due to differences in the peripheral electronics and the way the ROM was wired to the MCU in the diesel ECU (I still haven’t had my hands on one in pieces to investigate), the default value of CSOR2, which is set to $1031 by the petrol ECU, was $BB71 in a diesel ECU (I do have lots of copies of diesel boot loader and firmware files to look at).

So the next step was to take the modified diesel boot loader and change all the places where it set CSOR2 to $BB71 back to $1031 to see if it worked on the petrol hardware. It worked!
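
A patch like this can be applied mechanically to a binary image. The Python sketch below is a guess at one way of doing it, not the actual patching code: it assumes the register is loaded with a `MOVE.W #imm,(xxx).W` instruction (opcode word $31FC) and that the low word of the CSOR2 address is $FA56. Neither detail is stated in the post, and a real boot loader might set the register in other ways that this pattern would miss.

```python
MOVE_W_IMM_ABS_W = 0x31FC  # MOVE.W #imm,(xxx).W opcode word
CSOR2_ADDR_LOW = 0xFA56    # ASSUMED low word of the CSOR2 register address


def patch_csor2(image: bytearray, old: int = 0xBB71, new: int = 0x1031) -> int:
    """Rewrite every 'MOVE.W #old,(CSOR2).W' to load `new` instead; return the count."""
    pattern = (MOVE_W_IMM_ABS_W.to_bytes(2, "big")
               + old.to_bytes(2, "big")
               + CSOR2_ADDR_LOW.to_bytes(2, "big"))
    patched = 0
    pos = image.find(pattern)
    while pos >= 0:
        if pos % 2 == 0:  # 68000 instructions are always word aligned
            image[pos + 2:pos + 4] = new.to_bytes(2, "big")
            patched += 1
        pos = image.find(pattern, pos + 1)
    return patched
```

Returning the number of patched sites makes it easy to spot a file where nothing matched, which would suggest the assumed instruction pattern is wrong for that image.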

So now I was able to boot a petrol ECU up on a diesel boot loader and upload diesel firmware and map files to the ECU. But as soon as it tried to launch the diesel firmware, it crashed again.

Running Firmware

The problem here was basically the same as the original problem with the boot loader. There were a couple of subroutines which accessed hardware that wasn’t there that were causing it to crash, and by a similar process of elimination I was able to knock those out and stabilise the ECU. Again, none of the features I needed in order to configure MEMS3 Mapper seem to have been lost, so those routines seem to be handling engine management hardware that I don’t need to worry about.

That all got me to the point where I could load up a full set of (slightly modified) diesel files on a petrol ECU. The ECU was relatively happy, ran the code normally and communicated normally AS A TD5 ECU. Which meant I could start testing all of the features of MEMS3 Mapper on diesel ECUs.

Basic read and write operations all worked fine now. I could read firmware and maps, use the mapping features on MEMS3 Mapper to modify maps and write them back. But all of this was known to work on standard Td5 ECUs already, so it was really just a confirmation that my Td5 Emulation setup was behaving like a regular Td5 ECU.

Custom Code Execution

The next thing I tried to test was the mechanism I had designed which allowed me to upload arbitrary code to RAM and execute it, as this was the basis for a lot of the extended features I had implemented on petrol ECUs. Here I hit two snags. The first snag was easy to get around. As part of my mechanism, I overwrite one word on the stack with the low word of the address of the RTS instruction at the end of the main system loop. This address was different on Td5 ECUs, but all I had to do was to write the correct address, one for petrol and one for diesel, depending on the ECU family selected.

The second snag was rather harder to work around. The above fix got all of the RAM agents which don’t require communications facilities to load and execute. However, any RAM agents that did require communications just crashed the ECU, and the reason didn’t take too much figuring out. The microcontroller in the ECU has two RAM modules, SRAM which is 4kB and TPURAM which is 3.5kB. The petrol ECUs configure these two modules to be at consecutive addresses, giving a contiguous block of 7.5kB of RAM from address $0000 to $1DFF. The diesel ECUs however configure the TPURAM module to hold custom microcode for the TPU module (Time Processor Unit), and so only have 4kB from the SRAM module available from address $0000 to $0FFF. Because my hybrid ECUs were running diesel boot loader code, they were (correctly) configuring the TPURAM module as a diesel ECU would, and so (correctly) only had 4kB of RAM available.

When deploying my RAM agents, the boot loader is running. The boot loader uses the lower portion of RAM for its own variables, so there are limits on what RAM addresses I can write to without crashing the boot loader. Writing to addresses from $0800 upwards seems to be generally safe, which means I can upload up to 2kB of code and data from address $0800 to $0FFF if I want to be compatible with both petrol and diesel ECUs. Because my RAM agents take complete control of the ECU and shut down the boot loader while executing, I can’t use any of the communications facilities of the boot loader, my programs have to be completely self-contained applications and contain their own communications routines, and the communications library I had developed for use in all of my firmware patches was 3.5kB! So there was no way that any of my communications-based agents were going to fit into RAM on a diesel ECU.
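
The numbers above are worth checking once, since everything that follows depends on them. A few lines of Python do it; the sizes and the $0800 safe base come from the post, while the constant and function names are just for illustration.

```python
KB = 1024
SRAM_SIZE = 4 * KB          # SRAM module
TPURAM_SIZE = 3 * KB + 512  # TPURAM module, 3.5kB

# Petrol: SRAM and TPURAM mapped back to back as one block of RAM.
petrol_ram_top = SRAM_SIZE + TPURAM_SIZE - 1
# Diesel: TPURAM holds TPU microcode, so only the SRAM module is usable.
diesel_ram_top = SRAM_SIZE - 1

SAFE_BASE = 0x0800  # lowest address that seems safe to write while the boot loader runs
upload_window = diesel_ram_top - SAFE_BASE + 1


def fits_in_upload_window(size_bytes: int) -> bool:
    """True if an agent of this size fits in the window common to both ECU families."""
    return size_bytes <= upload_window


print(f"petrol RAM top  ${petrol_ram_top:04X}")
print(f"diesel RAM top  ${diesel_ram_top:04X}")
print(f"upload window   {upload_window} bytes")
```

A 3.5kB library is 3584 bytes against a 2048-byte window, so it cannot fit whatever else is trimmed from the agent.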

Communications Libraries

For a while that had me beaten, but then it occurred to me: the reason my communications library was so large is that it was a fully interrupt-driven, background communication system. This is necessary in firmware patches as it can’t block the normal operation of the ECU. When it wants to send a message, it can’t sit there waiting for each byte to be sent before writing the next byte as the engine would have died in the meantime. Similarly, when checking for incoming messages, these must be processed in the background so the code can just pick them up when they are ready; it can’t sit waiting for a message as this will also kill the engine. None of these limitations applied in the case of my RAM agents. If I sent a program to the ECU in boot loader mode to read the registers for example, the program would have nothing to do other than to sit listening for request messages for blocks of data, or for a message telling it that we were finished, and replying with the chunks of data requested. So a much lighter-weight solution would be appropriate.

I split my communications library into two layers. The higher-level layer dealt with message structure and protocol, the lower-level layer dealt with the actual process of reading bytes from and writing bytes to the SCI module (Serial Communications Interface module in the MCU, which is what is connected to the OBDII port). I wrote two different versions of the lower layer, one of which implemented the full interrupt-driven background processing needed by things like the live mapping patch, and one of which was just a simple blocking foreground implementation which could be used in RAM agents. Doing this brought the library size down from around 3.5kB to below 1.2kB in agent form, which made it a lot easier to squeeze the agent code into the space available.
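
The layering idea can be shown in miniature in Python. Everything here is a hypothetical stand-in: `BlockingSci` fakes the SCI with in-memory buffers, and the frame format (length byte, payload, 8-bit sum) is invented for the example rather than taken from the real protocol. The point is only that the upper layer never cares whether bytes move by interrupt or by blocking.

```python
class BlockingSci:
    """Lower layer: blocking byte I/O (in-memory stand-in for SCI register access)."""

    def __init__(self, incoming: bytes = b""):
        self._rx = list(incoming)
        self.sent = bytearray()

    def read_byte(self) -> int:
        # A real agent would spin on the SCI status register until a byte arrives.
        return self._rx.pop(0)

    def write_byte(self, value: int) -> None:
        # A real agent would wait for the transmit register to empty first.
        self.sent.append(value & 0xFF)


class MessageLayer:
    """Upper layer: framing only; works with any object offering read_byte/write_byte."""

    def __init__(self, link):
        self.link = link

    def send(self, payload: bytes) -> None:
        frame = bytes([len(payload)]) + payload
        for value in frame + bytes([sum(frame) & 0xFF]):
            self.link.write_byte(value)

    def receive(self) -> bytes:
        length = self.link.read_byte()
        payload = bytes(self.link.read_byte() for _ in range(length))
        if self.link.read_byte() != ((length + sum(payload)) & 0xFF):
            raise ValueError("bad checksum")
        return payload
```

An interrupt-driven lower layer would expose the same two methods but fill and drain queues in the background, so `MessageLayer` could be reused unchanged in a firmware patch.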

Vector Tables

A self-contained MC68000 program always contains an exception vector table. This consists of 256 4-byte “vectors” which tell the CPU what routines are to be used to handle different “exceptions”, which might be interrupts (when a device needs the CPU’s attention to deal with an external event) or faults (arithmetic faults, access to invalid addresses etc.). Most of these vectors are unused, especially in such a simple program as one of my RAM agents, and are set to $FFFFFFFF, but the full table needs to be specified and it takes 1kB. This was again crippling on a Td5 ECU where I only had 2kB writeable as it used up half of my available space.

The solution to this was to realise that once I had taken control of the ECU I had the full 4kB of RAM available; it was only during upload, while the boot loader was running, that I was restricted to the 2kB block from $0800 to $0FFF. So I could send the vector table up in some kind of packed format and then unpack it into another area of RAM as my code started up, before enabling interrupts (which needed the table to be in place). There were only a few different possible values for each of the 32-bit vectors: the first one was the initial stack pointer, the second one the program entry point, unused ones were $FFFFFFFF, there was a default do-nothing handler for all those interrupts set up for the hardware by the boot loader that I didn’t need to do anything about, one for the periodic interrupt timer for handling timeouts and one for the SCI for communications. So instead of sending up 256 32-bit addresses, I could send up 256 4-bit codes, each of which identified one of the above vectors, and then decode these into the actual table at runtime. This would reduce the size of the table from 1024 bytes to 128 bytes. In addition, I noticed that only the first part of the table is really used. The last vector which is non-$FFFFFFFF is vector number $60; vector numbers $61 to $FF are all unused. So if I added one more code to mean “end of table”, I could just assume when unpacking the table that everything after that was $FFFFFFFF without having to actually specify a code for each one. Packing the 4-bit codes for vectors $00-$60 plus an end-of-table mark into 32-bit words meant only about 50 bytes to upload instead of 1024.
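
The packing scheme is simple enough to sketch in Python. The code numbering, the handler addresses and the choice of which slots hold which handler are all invented for the example; only the general idea (4-bit codes, an end-of-table marker, $60 as the last used vector) comes from the description above.

```python
UNUSED = 0xFFFFFFFF
END_OF_TABLE = 0xF   # hypothetical code meaning "everything after this is unused"
TABLE_ENTRIES = 256


def pack_vectors(table, values):
    """Pack a 256-entry vector table into 4-bit codes (index into `values`)."""
    last_used = max(i for i, v in enumerate(table) if v != UNUSED)
    codes = [values.index(v) for v in table[:last_used + 1]]
    codes.append(END_OF_TABLE)
    if len(codes) % 2:                 # pad to a whole number of bytes
        codes.append(END_OF_TABLE)
    return bytes((codes[i] << 4) | codes[i + 1] for i in range(0, len(codes), 2))


def unpack_vectors(packed, values):
    """Rebuild the full table, filling everything after the marker with $FFFFFFFF."""
    table = []
    for byte in packed:
        for code in (byte >> 4, byte & 0x0F):
            if code == END_OF_TABLE:
                return table + [UNUSED] * (TABLE_ENTRIES - len(table))
            table.append(values[code])
    return table + [UNUSED] * (TABLE_ENTRIES - len(table))


# Made-up addresses: unused, stack pointer, entry point, default, PIT and SCI handlers.
VALUES = [UNUSED, 0x00001000, 0x00000800, 0x00000900, 0x00000910, 0x00000920]
example_table = [UNUSED] * TABLE_ENTRIES
example_table[0x00] = VALUES[1]
example_table[0x01] = VALUES[2]
for slot in range(0x40, 0x60):
    example_table[slot] = VALUES[3]
example_table[0x50] = VALUES[4]
example_table[0x60] = VALUES[5]

packed = pack_vectors(example_table, VALUES)
print(len(packed), "bytes packed, versus", 4 * TABLE_ENTRIES, "unpacked")
```

With $60 as the last used vector this comes to 98 nibbles, or 49 bytes before any alignment padding, in line with the figure of roughly 50 bytes quoted above.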

Word Addressing Mode

Another optimisation was to make sure that my code used word addressing everywhere that it addressed RAM or module registers. The 68000 CPU has a 24-bit address bus, with addresses $000000 to $FFFFFF. These can be addressed using a 4-byte long word (the size of the registers) where the top byte is meaningless and discarded. That means 4 bytes are used for every address in code. But there is a shorthand addressing mode, where you only need to specify a 2-byte word. Word values range from $0000 to $FFFF, and in word addressing mode these are sign-extended. Addresses where the top bit is set ($8000-$FFFF) are treated as NEGATIVE numbers, and the address that they refer to is the corresponding negative number in 32 bits. So $0000-$7FFF refer to addresses $00000000-$00007FFF but $8000-$FFFF actually refer to addresses $FFFF8000 to $FFFFFFFF. As the RAM occupies $0000 to $1DFF (on a petrol, or just to $0FFF on a diesel) and the module registers all occupy high addresses $FFFFFXXX, they are all accessible using word addressing mode. This makes the instructions not only faster to load but, more importantly in this case, one word smaller in RAM.
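
The sign-extension rule can be checked with a couple of small Python helpers. The function names are made up, and $FFFFFA56 is used purely as an example of a high address of the $FFFFFXXX form.

```python
def sign_extend_word(word: int) -> int:
    """Sign-extend a 16-bit value to 32 bits, as word addressing mode does."""
    word &= 0xFFFF
    return (word | 0xFFFF0000) if (word & 0x8000) else word


def reachable_by_word(address: int) -> bool:
    """True if a 32-bit address can be written as a sign-extended 16-bit word."""
    address &= 0xFFFFFFFF
    return sign_extend_word(address & 0xFFFF) == address


for address in (0x00000FFF, 0x00001DFF, 0xFFFFFA56, 0x00120000):
    print(f"${address:08X}", "word" if reachable_by_word(address) else "long")
```

RAM and the module registers both pass the test, whereas something in the firmware area such as $120000 still needs a full long address.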

Implementing all of these reduced the total size of my largest RAM agent programs down to about 1.2kB, leaving 0.8kB free and enabled me to implement all of the features based on them on Td5 ECUs.

Serial EEPROM

The ECUs had one last trick to play on me. Even after I had all of the RAM agents loading and executing successfully, I noticed there was a problem with the serial EEPROM chip. This is the little memory chip which stores things that change while the ECU is running, like fault codes, immobiliser pairing information and ECU adaptations. I have provided features in MEMS3 Mapper to allow this to be read and written, which allows an ECU to be truly cloned with all of its immobiliser settings, fault codes and history, already adapted to the engine. But reading the serial EEPROM on an emulated Td5 ECU just gave me a file full of $FF bytes, and writing and erasing the serial EEPROM just went through the motions without erroring but didn’t seem to actually do anything.

The explanation for this was very similar to the problem which stopped me from writing to the main ROM early on. The peripheral electronics and the connection between the MCU and the serial EEPROM were different in a Td5 and petrol ECU. Again, I couldn’t prove this by actually looking inside a Td5 ECU as I didn’t have one, but I could work it all out from the code and the datasheets.

The 93C66 serial EEPROM is connected to the microcontroller using an SPI serial bus. This is a synchronous serial bus which may be used to connect a microcontroller to a number of peripheral devices. Peripheral Chip Select signals are used to tell each device whether it is being spoken to. The way in which the microcontroller talks to SPI devices is through its QSPI module (Queued Serial Peripheral Interface) which uses 3 blocks of RAM to queue up operations. There is a CR (Command RAM) block which holds a series of 1-byte commands, a TR (Transmit RAM) which holds a series of 2-byte words to be written, and an RR (Receive RAM) which stores the 2-byte words sent back.

The top 4 bits of each CR byte determine what is to be done, whether it should continue on to the next command etc. The bottom 4 bits of each CR byte determine which device it is talking to, i.e. which Peripheral Chip Select signals should be asserted. I noticed that the commands set up by a petrol ECU all had 5 in the lower 4 bits, meaning that PCS2 and PCS0/SS were asserted. PCS2 selects the serial EEPROM (a check on the circuit board found PCS2 connected directly to the Enable pin on the 93C66) and PCS0/SS is Slave Select, which tells the serial EEPROM to be in slave mode (SPI communications are always between a master and a slave, with the master being in charge of timing; in this case the CPU/SPI is the master and the 93C66 is the slave). Comparing the same routines in the diesel ECU, I found exactly the same command sequences but with 3 in the lower 4 bits of each command byte, meaning that PCS1 was asserted rather than PCS2. So in a Td5 ECU, the serial EEPROM must be connected to PCS1 instead of PCS2.
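
Decoding the low nibble is just a matter of testing bits, as this short Python sketch shows. It follows the interpretation given above (a set bit means that select line is asserted); the example command byte $C5 is made up, since the post only gives the lower 4 bits.

```python
PCS_NAMES = ["PCS0/SS", "PCS1", "PCS2", "PCS3"]


def selected_pcs(cr_byte: int) -> list:
    """Names of the Peripheral Chip Select lines flagged in the low nibble."""
    return [name for bit, name in enumerate(PCS_NAMES) if cr_byte & (1 << bit)]


def retarget(cr_byte: int, pcs_bits: int) -> int:
    """Keep the command bits (top nibble) and swap in a different chip-select nibble."""
    return (cr_byte & 0xF0) | (pcs_bits & 0x0F)


petrol_cmd = 0xC5                     # hypothetical command byte with 5 in the low nibble
diesel_cmd = retarget(petrol_cmd, 0x3)
print(selected_pcs(petrol_cmd))       # petrol: PCS0/SS and PCS2
print(selected_pcs(diesel_cmd))       # diesel: PCS0/SS and PCS1
```

Keeping the top nibble untouched reflects the observation that the command sequences were otherwise identical between the two ECU families.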

This needed two things doing to fix it.

1.      The custom RAM agents I was sending up to the ECU to manipulate the serial EEPROM needed two different versions. One for petrol ECUs, using “5” commands and one for diesel ECUs, using “3” commands. In addition, I needed to add a new ECU family to MEMS3 Mapper (which only shows up when you choose to enable developer features) for “Td5 Emulation”. This is specifically for use with petrol Rover K Series ECUs running modified Land Rover Td5 boot loaders and firmwares as described here. It communicates with the ECU as though it were a Td5, but where the custom agents need to be different on petrols and diesels because of physical hardware differences, it sends the PETROL version to match the hardware.

2.      Some pins on the microcontroller chip are dual-purpose and can be programmed to do one of two jobs. The Peripheral Chip Select pins PCS0-PCS2 can also be programmed to be general purpose I/O pins as part of PORT QS. There is a register PQSPAR (Port QS Pin Assignment Register) which programs the pin assignments. Petrol ECUs need to use pin PCS2, but the diesel boot loaders program PQSPAR such that pin PCS2 is not available and is assigned to PORT QS instead. So I needed to make some more small patches to the Td5 Emulation boot loaders and firmwares to change the values assigned to the PQSPAR register to match those used by the petrol ECUs.

Making those changes got the serial EEPROM code working.

Fault Codes

The Td5 ECU uses an early, pre-OBDII compliant fault code system. Fault codes are usually given in the form Error 12-06, where the first number (12 in the example) is always between 1 and 35, and the second number (06 in the example) is always between 0 and 7. I managed to find several lists on the web defining the corresponding error message for each possible code, but no definitive description of how the codes were extracted from the ECU. However, I did notice that the error codes were read from the ECU using Service $21 and a specific local identifier value for the error code record, and the response to this message always contained 36 bytes, byte 0 being the record identifier and bytes 1 to 35 being data. So an obvious possibility was that each bit of each byte returned would be 1 if an error was present, 0 if no error was present, and the byte numbers 1-35 corresponded to the first number in the error code and the bit numbers 0-7 corresponded to the second number. So in the example, Error 12-06 would be present if bit 6 of byte 12 of the response was 1.
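
Under that interpretation, decoding the record is a double loop over bytes and bits. Here is a minimal Python sketch; the function names are made up and the record identifier in byte 0 is left as a placeholder because its value isn’t given above.

```python
def decode_fault_record(response: bytes) -> list:
    """Return (byte number, bit number) pairs for every set bit in bytes 1 to 35."""
    if len(response) != 36:
        raise ValueError("expected 36 bytes: record identifier + 35 data bytes")
    return [(byte_no, bit_no)
            for byte_no in range(1, 36)
            for bit_no in range(8)
            if response[byte_no] & (1 << bit_no)]


def format_fault(code) -> str:
    """Format a (byte, bit) pair in the usual 'Error 12-06' style."""
    byte_no, bit_no = code
    return f"Error {byte_no}-{bit_no:02d}"


record = bytearray(36)   # byte 0 would hold the record identifier (placeholder here)
record[12] = 1 << 6      # set bit 6 of byte 12
for fault in decode_fault_record(bytes(record)):
    print(format_fault(fault))
```

Looking up the description for each pair is then a dictionary lookup, with “Unknown Error” as the fallback for codes missing from the published lists.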

There were a couple of problems with this interpretation though:

1.      My Td5 Emulation ECUs were returning records with a lot of bits set, many of which did not have corresponding descriptions in the lists I had found, and for which I was just displaying “Unknown Error”.

2.      I had another tool available which claimed to support Td5 error code reading, but its interpretation of the error codes was clearly very different. So one of us was wrong!

In the end I decided that the evidence in favour of my interpretation was so strong that the best thing to do was go with it, then test it on a real Td5. I had someone who had kindly volunteered to test all of this on his own Discovery Td5 once I was happy with it on emulated hardware. My hunch was that all of the spurious error codes I was seeing were just a consequence of running on the wrong hardware; the ECU was reporting faults it was finding with internal hardware which would almost never arise in a real ECU and had not therefore been catalogued in the wild.

RAM Corruption

One thing that I did notice while developing all of this was that, just very occasionally and almost always on a Td5 Emulation ECU, one of my RAM agents would go a bit haywire. Looking at RAM dumps from petrol and diesel ECUs, it’s clear that the memory map of the Td5 ECU isn’t as clean as the memory map of the petrol ECU. When the ECU boots, one of that tasks the boot loader performs is to test all of RAM by writing different bit patterns to each byte and reading them back again, and when it has finished it leaves all of RAM set to $00. If you dump the RAM of a petrol ECU running the boot loader, what you see is a clearly defined cut-off above which the RAM is all still $00. RAM above this address has just not been used, but RAM below this address is generally full of what looks like real data. On a Td5 ECU, there is still a cut-off but there are occasional non-$00 bytes above this address. So there isn’t a solid block where you can for sure “that’s not used”. It does work in the vast majority of cases, but I think very occasionally the ECU overwrites something I’ve uploaded. To protect against this, I’ve now added a checksum to all of the RAM agent programs I upload. At the start of the main entry point code, after it has shut down the boot loader (is executing its own code, has reduced all interrupt priorities to 6 or less to ensure they can be masked, set the priority mask to 7 to disable all interrupts, so there is no reason why any native ECU code should ever execute, so no reason why any more bytes might be overwritten) I calculate the checksum and only proceed with execution if it is correct. If not, I reset ECU, which causes the PC to retry the operation. This always seems to work OK. I suppose that there’s a really tiny chance that the checksum code itself could get corrupted in such a way that it allows the execution to proceed even though damages, although the chances of that are so small I’m not going to worry about it. 
It’s almost certain that a random corruption of the code caused by overwriting of instructions with data will lead to an invalid instruction exception or an address fault, which are set in the vector table to trigger an ECU reset anyway, or a tight loop which will trigger the watchdog mechanism, so the effect is the same. The ECU architecture is very heavily protected against faults, and anything that leaves it running anything other than the code it was designed to run in an orderly fashion will just lead to the watchdog timer restarting it. I’ve NEVER seen it need to retry more than once, but the PC-end code provides for up to three retries before it gives up.
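
The check-then-retry scheme described above can be sketched roughly as follows. This is only an illustration in Python, not the real agent code (which runs on the ECU's CPU) and not the MEMS3 Mapper source. The post does not say which checksum algorithm is used, so a simple 16-bit additive sum is assumed here, and `append_checksum`, `agent_is_intact`, `run_with_retries` and the `upload_and_run` callback are all hypothetical names.

```python
from typing import Callable

MAX_RETRIES = 3  # the PC-end code allows up to three retries


def checksum16(data: bytes) -> int:
    """Assumed algorithm: 16-bit additive sum of all bytes."""
    return sum(data) & 0xFFFF


def append_checksum(agent: bytes) -> bytes:
    """PC side: append a big-endian checksum to the agent before upload."""
    return agent + checksum16(agent).to_bytes(2, "big")


def agent_is_intact(image: bytes) -> bool:
    """ECU side (conceptually): verify the uploaded image before running it."""
    if len(image) < 2:
        return False
    body, stored = image[:-2], int.from_bytes(image[-2:], "big")
    return checksum16(body) == stored


def run_with_retries(upload_and_run: Callable[[bytes], bool], agent: bytes) -> bool:
    """PC side: one initial attempt plus up to MAX_RETRIES retries.

    upload_and_run should return False when the ECU reset itself instead
    of running the agent (for example because the checksum did not match).
    """
    image = append_checksum(agent)
    for _attempt in range(1 + MAX_RETRIES):
        if upload_and_run(image):
            return True
    return False
```

A single corrupted byte changes the additive sum, so `agent_is_intact` fails and the retry loop simply uploads a fresh copy.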

Td5 Emulation Library

I’ve now made a full library of all of the Td5 boot loaders and a good selection of Defender and Discovery firmware and map files, with all of the necessary patches applied to allow me to run them on petrol ECUs for the purposes of developing MEMS3 Mapper support for the Td5 ECU. I’ve also written a program that searches for and applies all of the above patches, so if I find any more patches that are needed or I want to add more firmware and map files to the library I can just re-run the program to apply the patches across the full set of files.
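
As a rough idea of how a search-and-apply patcher of this kind can be structured, here is a minimal Python sketch. It is not the actual program: the `Patch` type, the function names and the `.bin` file extension are assumptions for illustration, and any byte patterns passed in would be placeholders rather than the real patches.

```python
from pathlib import Path
from typing import NamedTuple


class Patch(NamedTuple):
    name: str
    find: bytes     # byte sequence to search for
    replace: bytes  # replacement of the same length


def apply_patches(image: bytes, patches: list[Patch]) -> tuple[bytes, list[str]]:
    """Return the patched image and the names of the patches that matched."""
    data = bytearray(image)
    applied = []
    for patch in patches:
        if len(patch.find) != len(patch.replace):
            raise ValueError(f"{patch.name}: replacement must keep the length")
        hit = False
        start = data.find(patch.find)
        while start != -1:
            data[start:start + len(patch.find)] = patch.replace
            hit = True
            # Continue after the replaced bytes so the loop always terminates.
            start = data.find(patch.find, start + len(patch.find))
        if hit:
            applied.append(patch.name)
    return bytes(data), applied


def patch_library(src_dir: Path, dst_dir: Path, patches: list[Patch]) -> None:
    """Re-run every patch over every file, leaving the originals untouched."""
    dst_dir.mkdir(parents=True, exist_ok=True)
    for src in sorted(src_dir.glob("*.bin")):
        patched, applied = apply_patches(src.read_bytes(), patches)
        (dst_dir / src.name).write_bytes(patched)
        print(f"{src.name}: {', '.join(applied) or 'no patches matched'}")
```

Keeping replacements the same length as the pattern means no addresses shift, and always patching from the untouched originals means the whole library can be regenerated whenever a new patch is added.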

State of Play

Well today I heard back from Sergei who has been testing all of this on his Discovery Td5, and I’m pleased to report that it all worked correctly first time of trying. My Td5 Emulation ECUs were clearly a faithful enough representation of a real Td5 ECU for me to base the development on, and my interpretation of the fault codes was clearly correct. On his vehicle there were a few fault codes showing, but they were exactly as expected and due to some modifications and conversion he had done.

So, where have I got to now? MEMS3 Mapper now basically works with Td5 ECUs. Most of the custom features work, including reading RAM, reading boot loaders, reading, writing and erasing the serial EEPROM, updating the boot loader, erasing firmware, erasing coding history and erasing the map (all on a stock, unmodified ECU). Live diagnostics and live data now work.

Still to be done: I need to work out what other commands and routines the Td5 ECU supports and align these with the petrol ECUs. For example, there are commands for clearing adaptations on petrol ECUs; the same commands don’t work and have nasty side-effects on Td5 ECUs (so are currently disabled), but there may be equivalents on the Td5, so if I can identify them I can re-enable the feature for all ECUs and send the appropriate command in each case. The petrol ECUs also have commands to let you manually activate a lot of the ECU actuators, like the fuel pump, injectors and ignition coils, for testing purposes. It would be good to work out if the Td5 has this feature and what actuators can be supported.
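
One way to picture "send the appropriate command in each case" is a per-ECU lookup table in which a missing entry keeps the feature disabled. The Python sketch below is purely illustrative: the names are hypothetical and the byte values are placeholders, NOT real MEMS3 command or routine IDs.

```python
from enum import Enum
from typing import Optional


class EcuFamily(Enum):
    PETROL = "petrol"
    TD5 = "td5"


# Placeholder values only. None means "no known safe equivalent yet",
# so the feature stays disabled for that ECU family.
COMMANDS: dict[str, dict[EcuFamily, Optional[bytes]]] = {
    "clear_adaptations": {
        EcuFamily.PETROL: b"\x00\x01",  # hypothetical placeholder
        EcuFamily.TD5: None,            # equivalent not yet identified
    },
}


def command_for(feature: str, family: EcuFamily) -> Optional[bytes]:
    """Look up the command for a feature, or None if it must not be sent."""
    return COMMANDS.get(feature, {}).get(family)


def is_enabled(feature: str, family: EcuFamily) -> bool:
    """A feature is only offered when a command is known for that ECU."""
    return command_for(feature, family) is not None
```

With this shape, re-enabling a feature on the Td5 is just a matter of filling in the missing table entry once the equivalent command has been identified.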

Custom Firmware Patches

1.      The Live Mapping patch is never going to work on a Td5 ECU. Unfortunately, the whole idea is based on having a reasonable amount of spare RAM into which the live tables and scalars can be copied, and the Td5 is just too compromised with only 4kB total for this to be viable.

2.      The Debugging patch is also unfortunately in the same boat. It needs enough RAM to store a meaningful debugging report.

 

3.      The Dual Map Live Switching patches, however, look like they can be adapted. These should be very useful on a Td5, which is very tuneable. The basic mechanism by which the ECU addresses the map is the same as on a petrol ECU. There is typically even more spare space in the firmware area of a Td5 ECU to take a second map than in a petrol ECU; the ones I’ve looked at so far had about 71kB free in the firmware area, which means they will easily fit another 16kB map. The issue is finding suitable unused, CPU-readable inputs to use for the switch. The option of switching maps based on the throttle position at power-on is not available as I believe the native ECU code already uses this input to trigger a fuel system purge.

This is not a question I can answer on emulated hardware; I would need to investigate what is physically available on a real Td5 ECU. Again Sergei has kindly offered to help me:

I’ve developed a screen in MEMS3 Mapper which displays the raw analog and digital inputs readable by the microcontroller, as shown below. Most of the analog reads on the left which show 1023 are typical sensor inputs, designed to be used with a variable resistance sensor to ground, resulting in a variable voltage between 0V and 5V at the input terminal. With no sensor connected, the input voltage will be 5V and so the reading will be the maximum scale of the analog to digital converter, which is $3FF hexadecimal or 1023 decimal. By watching this screen while connecting a low resistance from each “unused” input pin to ground, it is possible to identify whether each ECU pin is actually connected to an input circuit internally and if so, how to read it. Doing this on the petrol ECUs I unearthed a couple of additional usable inputs, including Pin 10, which as far as I can see is unused on all production vehicles and therefore available as a general purpose map switch input. Typically each input pin is dual-purpose, being readable as an analog input and as a digital input on one of the ports PORTQA or PORTQB. If we can find even one of those on a Td5 ECU, my map switching patch can use the digital read of the pin to determine which map to use, whilst at the same time patching the programming of the analog to digital converter to return 1023 in the results table whatever the input pin voltage is, which means the ECU code which uses the analog read will not even notice what we are doing.
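
To make the arithmetic behind that test concrete, here is a small Python sketch of the input circuit as described: a pull-up to 5V, a resistance to ground, and a 10-bit converter. The pull-up value, the detection margin, the active-low switch convention and all of the function names are assumptions for illustration; none of them are measured from a real ECU.

```python
from typing import Optional

ADC_MAX = 0x3FF  # 10-bit converter full scale (1023 decimal)
V_REF = 5.0      # volts at the input with nothing connected


def adc_counts(r_to_ground: Optional[float], r_pullup: float = 2200.0) -> int:
    """Expected reading for a resistance to ground (None = open circuit).

    r_pullup is an assumed internal pull-up value, not a measured one.
    """
    if r_to_ground is None:
        return ADC_MAX
    volts = V_REF * r_to_ground / (r_to_ground + r_pullup)
    return round(ADC_MAX * volts / V_REF)


def pin_is_connected(open_reading: int, grounded_reading: int, margin: int = 50) -> bool:
    """A pin with a live input circuit drops well below full scale
    when a low resistance to ground is attached."""
    return open_reading - grounded_reading > margin


def select_map(port_value: int, bit: int, active_low: bool = True) -> int:
    """Pick map 0 or map 1 from one bit of a digital port read.

    active_low assumes the switch pulls the pin to ground to select map 1.
    """
    level = (port_value >> bit) & 1
    switched = (level == 0) if active_low else (level == 1)
    return 1 if switched else 0
```

An unconnected pin would read the same value with or without the test resistance attached, which is what distinguishes it from a pin that has a usable input circuit behind it.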



 

Other than for those things, I think the current Version 5.98 Release MEMS3 Mapper now has pretty much full support for Td5 ECUs. I’ll work on the last few features when time allows and when I can get a day in front of a T4 system again.

Collateral Damage

All in all, it cost me a total of four bricked ECUs along the way, which I think is not too bad going! A couple of those I have already mentioned. With another, I was so excited to get the boot loader flashing working that I just sat there flashing different boot loaders one after another and then accidentally picked a stock diesel boot loader, so that was my own silly fault. MEMS3 Mapper now protects you against flashing incompatible boot loaders. The last one was due to a fault in my program that automated the patching of the boot loaders and firmware files for my library; it made a mess of one and I flashed it before I noticed, but that’s also been fixed. They were only £5 eBay test mule ECUs, and in theory can be recovered with a bit of effort, a hot air soldering station and a bench EEPROM programmer.

Motorola made the ECUs at two plants, Angers and Stotfold. It usually says on the label on the ECU which plant it was produced at. ECUs produced at the Angers plant are a lot easier to recover, as they have bare clean circuit boards inside. ECUs produced at the Stotfold plant have the finished circuit board conformally coated, that is to say sprayed with a thin layer of liquid silicone sealant, which coats everything and completely seals them against moisture but is very difficult to remove and which makes removing components and soldering repairs cleanly a lot harder. Of the four ECUs I bricked, two were from Angers and two from Stotfold. Although the Stotfold ones are repairable, for the price of a replacement eBay test mule it’s not worth the hassle so I think I’ll recover the two Angers ECUs and bin the other two.

 

rj
Offline
Last seen: 2 days 15 hours ago
Joined: 17/04/2014

Your effort keep impressing me Andrew ,-) Wish I had your skills!

tbirds search:

Blatchat Google Search

Dave Hardcastle
Offline
Last seen: 1 hour 33 min ago
Joined: 01/08/2015

I always feel SO humble when I read your posts, keep up the good work.