Board hangs regularly

Discussion in 'Troubleshooting' started by cesco_78, Jun 3, 2015.

  1. cesco_78

    cesco_78 New Member

    Joined:
    Apr 12, 2015
    Messages:
    2
    Likes Received:
    0
    Hi,
    I have a strange issue with my Udoo Quad board. After a few days of continuous power (24 hours a day), the board disappears from the LAN, gives no serial response and no video output via HDMI; the only way to recover is a hard reboot.
    There is nothing unusual in syslog, and I don't know why the system hangs.

    Could you help me please?

    I use the official Ubuntu OS image, with WiFi and Ethernet both connected and configured, an external SATA HD powered from the board, and the official power supply.

    Thanks!!
     
  2. fetcher

    fetcher Member

    Joined:
    Mar 9, 2014
    Messages:
    166
    Likes Received:
    20
    One of my boards has been doing the same thing (well, the watchdog reboots it afterwards -- see below), but very infrequently, about once every 30-40 days. There is no apparent correlation with system load, temperature, or power consumption. I suspect one of the DDR RAM chips may have a subtle flaw, but unfortunately 'memtest86+' is not available for the ARM architecture, and the user-mode 'memtester' is less effective at flushing out sporadic errors. With these BGA surface-mount chips soldered on, replacing a bad one without specialized equipment and skills is out of the question.

    If you'd prefer a crashed system to auto-reboot rather than stay down, the standard kernel already includes a driver for the i.MX6's onboard watchdog timer circuit -- just install the user-mode part with 'apt-get install watchdog', then edit /etc/watchdog.conf to make sure this line is set:

    watchdog-device = /dev/watchdog

    And make sure /usr/sbin/watchdog or /usr/sbin/wd_keepalive starts up at system boot time.
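
    For anyone curious what the user-mode daemon is doing, here is a minimal sketch of the keepalive loop, assuming the standard Linux watchdog device behaviour (opening the device arms the timer, any write resets it, and writing 'V' before closing asks the driver to disarm it). The function name keep_alive and its parameters are made up for illustration, and the interval has to be shorter than the hardware timeout. In normal use the packaged daemon does all of this for you.

```python
import time


def keep_alive(device="/dev/watchdog", interval=10.0, beats=None, sleep=time.sleep):
    """Pet the watchdog every `interval` seconds.

    beats=None loops forever; an integer stops after that many writes and
    then performs the "magic close" so the driver may disarm the timer.
    Returns the number of keepalive writes performed.
    """
    written = 0
    # Opening the device arms the timer. buffering=0 makes every write
    # reach the driver immediately instead of sitting in a Python buffer.
    with open(device, "wb", buffering=0) as wd:
        while beats is None or written < beats:
            wd.write(b"\0")  # any write resets the countdown
            written += 1
            sleep(interval)
        # Magic close: only honoured if the driver supports it and was
        # not built with the "nowayout" option.
        wd.write(b"V")
    return written
```

    Calling keep_alive() as root with the defaults would run until the process dies. If the kernel or the process hangs, the writes stop and the hardware timer resets the board.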

    What's the warranty period on Udoo products, anyway? I have a spare that I'll be swapping in for the flaky unit to confirm that solves the problem, but getting that first board unmounted from where it is will be quite a chore.
     
  4. Andrea Rovai

    Andrea Rovai Well-Known Member

    Joined:
    Oct 27, 2014
    Messages:
    1,703
    Likes Received:
    240
    Hi both,
    can you both post your memtester logs? In cases like these, if the product is still covered by warranty, an RMA should be considered; that is why I'm asking.
     
  5. cesco_78

    cesco_78 New Member

    Hi Andrea,
    I had an RMA approved a few days after posting my problem :)

    Hi Fetcher,
    thanks for the "watchdog" hint, I didn't know about this software :)
     
  6. Andrea Rovai

    Andrea Rovai Well-Known Member

    Well then!
     
  7. fetcher

    fetcher Member

    To follow up, I think I've found and removed the source of these mysterious lock-ups. The Udoo in question has had a 16x2 text LCD attached to it (one of this kind, with RGB backlight: http://www.adafruit.com/products/399 ), which requires 5V power, but will accept 3.3V levels on its data pins. The LCD's R/W pin was grounded to ensure it never tried to send data back to the Udoo at 5V levels and that its data pins were always in a high-impedance state. For the backlight drive, I used PNP transistors (2N3906 or so) for high-side switching of 5V using 3.3V PWM signals.

    Connected in this way, the LCD and its backlight have both been working perfectly, but despite that, I suspect there may have been some reverse current leakage causing trouble for the Udoo. One oddity I noticed from the start was that with the LCD connected, it was no longer possible to halt/powerdown the i.MX6 for very long -- it would instead reboot itself soon after, even with no watchdog timer active. This only happened with the LCD attached, and perhaps I should have taken that as a warning sign.

    Details of the original hook-up:
    Code:
    Double-row header, ODD-row pins used for HD44780 LCD:
      D7: 29 / gpio135
      D6: 33 / gpio139 (ECSPI2_MISO, mirrored on 50)
      D5: 35 / gpio141 (ECSPI2_SCLK, mirrored on 52)
      D4: 39 / gpio205
      EN: 41 / gpio35
      RS: 43 / gpio33
    
      bklt-R: PWM10 / gpio1  / i.MX PWM 2
      bklt-G: PWM7  / gpio42 / i.MX PWM 4
      bklt-B: PWM6  / gpio41 / i.MX PWM 3
    
    I'd chosen pins that didn't conflict with UARTs, SPI buses or other potentially useful peripherals, and was picking up +5V and GND from the end of that double header also.

    Since unplugging the LCD, the board is now at 50 days' uptime, which it would never achieve before. I plan to reconnect it through an MCP23017 I/O expander chip behind the I2C bus, which already goes through a 3.3V to 5V level-shifter to support other I2C devices running at 5V (DS2482-100 1-wire bus controllers, other MCP23017 and MCP23008 GPIO chips). This will sacrifice the ability to do PWM-based backlight dimming, but should hopefully isolate any stray backfeed voltage from the i.MX6.
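
    As a rough sketch of what driving the LCD through the expander could look like, the snippet below works out the MCP23017 port A register values needed to clock one byte into an HD44780 in 4-bit mode (high nibble first, with EN pulsed high and then low, since the LCD latches on the falling edge). The wiring is purely an assumption for illustration and is not from the setup described above: D4..D7 on GPA0..GPA3, EN on GPA4, RS on GPA5. The GPIOA address 0x12 is the datasheet value for the default register banking.

```python
# Hypothetical wiring of the LCD to MCP23017 port A (assumed for this sketch).
D4_BIT = 0   # D4..D7 occupy GPA0..GPA3
EN_BIT = 4   # enable strobe on GPA4
RS_BIT = 5   # register select on GPA5

GPIOA = 0x12  # MCP23017 port A data register (default IOCON.BANK = 0)


def lcd_byte_to_writes(value, rs):
    """Return (register, value) pairs that send one byte in 4-bit mode.

    rs=True sends character data, rs=False sends a command.
    Each nibble produces two writes: EN high, then EN low.
    """
    writes = []
    for nibble in (value >> 4, value & 0x0F):  # high nibble goes first
        base = ((nibble & 0x0F) << D4_BIT) | (int(bool(rs)) << RS_BIT)
        writes.append((GPIOA, base | (1 << EN_BIT)))  # EN high: data presented
        writes.append((GPIOA, base))                  # EN low: LCD latches here
    return writes
```

    Each pair would then go out over I2C to the expander (for example with an SMBus-style write_byte_data call to the chip's address, 0x20 when its address pins are grounded), after port A has been set to outputs. The short delays the LCD needs between operations are left out here.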
     
  8. orfeus

    orfeus New Member

    Joined:
    Feb 25, 2014
    Messages:
    9
    Likes Received:
    0
    Same strange behaviour here: my UDOO runs 24/7 and after a few weeks it just stops responding (UDOObuntu 1.1; the last hang was 2 days ago, and I can report the next one). GPIOs whose value is set high stay high. I have not configured the watchdog yet, so every time I do a manual shutdown.

    An RMA is not a solution for me; I need the UDOO to do its work ..:-/
     
  9. fetcher

    fetcher Member

    orfeus, do you have the CN6 serial console connected full-time to a system that can log all received data? If not, this would be a good idea, to capture a possible kernel panic message. With my lockup problem, though, which I believe was due to attached hardware (see above -- hopefully solved!), there was never any kernel message... just the U-Boot output as the watchdog times out and forces a reboot.
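
    A minimal sketch of such a logger, assuming the console shows up on the logging machine as an already-configured serial device. The function name log_console is made up for illustration. It copies each received line to a log with a timestamp and flushes after every line, so the last output before a hang is not lost in a buffer.

```python
import datetime


def log_console(stream, log, now=datetime.datetime.now):
    """Copy lines from a console byte stream to a text log.

    Each line gets an ISO timestamp prefix. Returns the number of
    lines copied once the stream reports end-of-file.
    """
    count = 0
    # readline() returns b"" at end-of-file, which ends the loop.
    for raw in iter(stream.readline, b""):
        # Consoles can emit junk bytes at boot; replace rather than crash.
        text = raw.decode("utf-8", errors="replace").rstrip("\r\n")
        log.write(f"{now().isoformat(timespec='seconds')} {text}\n")
        log.flush()  # keep the last lines before a hang
        count += 1
    return count
```

    It could be run with something like open("/dev/ttyUSB0", "rb", buffering=0) as the stream and a file opened in append mode as the log. The device path is only an example, and the port's baud rate has to be set beforehand (e.g. with stty).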
     
  10. orfeus

    orfeus New Member

    Hi fetcher, thanks for the tip about connecting a serial console; I will try it and report back.

    My UDOO only has a few relay boards and one USB Arduino Uno connected.

     
