Version 2 (modified by pb, 12 years ago)

--

WAM Troubleshooting

Checking the Error Log

One of the first steps in diagnosing many WAM problems is to examine the error log file, /var/log/syslog. Here is a normal startup log for a 7-DOF WAM:

Jul 29 20:27:41 WAM WAM: ...Starting btdiag program...
Jul 29 20:27:44 WAM WAM: Waking all pucks
Jul 29 20:27:44 WAM WAM: getBusStatus(): canReadMsg returned error
Jul 29 20:27:45 WAM last message repeated 22 times
Jul 29 20:27:45 WAM WAM: getBusStatus: Only status != -1 is shown.
Jul 29 20:27:45 WAM WAM: getBusStatus: status[1] = 2
Jul 29 20:27:45 WAM WAM: getBusStatus: status[2] = 2
Jul 29 20:27:45 WAM WAM: getBusStatus: status[3] = 2
Jul 29 20:27:45 WAM WAM: getBusStatus: status[4] = 2
Jul 29 20:27:45 WAM WAM: getBusStatus: status[5] = 2
Jul 29 20:27:45 WAM WAM: getBusStatus: status[6] = 2
Jul 29 20:27:45 WAM WAM: getBusStatus: status[7] = 2
Jul 29 20:27:45 WAM WAM: getBusStatus: status[10] = 2
Jul 29 20:27:45 WAM WAM: About to allocate space for 8 nodes
Jul 29 20:27:45 WAM WAM: getBusStatus(): canReadMsg returned error
Jul 29 20:27:45 WAM last message repeated 22 times
Jul 29 20:27:45 WAM WAM: getBusStatus: Only status != -1 is shown.
Jul 29 20:27:45 WAM WAM: getBusStatus: status[1] = 2
Jul 29 20:27:45 WAM WAM: getBusStatus: status[2] = 2
Jul 29 20:27:45 WAM WAM: getBusStatus: status[3] = 2
Jul 29 20:27:45 WAM WAM: getBusStatus: status[4] = 2
Jul 29 20:27:45 WAM WAM: getBusStatus: status[5] = 2
Jul 29 20:27:45 WAM WAM: getBusStatus: status[6] = 2
Jul 29 20:27:45 WAM WAM: getBusStatus: status[7] = 2
Jul 29 20:27:45 WAM WAM: getBusStatus: status[10] = 2
Jul 29 20:27:45 WAM WAM: Puck: ID=1 CTS=4096 IPNM=2700.00 PIDX=0 GRPB=1
Jul 29 20:27:45 WAM WAM: Puck: ID=2 CTS=4096 IPNM=2562.00 PIDX=1 GRPB=1
Jul 29 20:27:45 WAM WAM: Puck: ID=3 CTS=4096 IPNM=2562.00 PIDX=2 GRPB=1
Jul 29 20:27:45 WAM WAM: Puck: ID=4 CTS=4096 IPNM=2700.00 PIDX=3 GRPB=1
Jul 29 20:27:45 WAM WAM: Puck: ID=5 CTS=4096 IPNM=4961.00 PIDX=0 GRPB=2
Jul 29 20:27:45 WAM WAM: Puck: ID=6 CTS=4096 IPNM=4961.00 PIDX=1 GRPB=2
Jul 29 20:27:45 WAM WAM: Puck: ID=7 CTS=4096 IPNM=17474.00 PIDX=2 GRPB=2
Jul 29 20:27:45 WAM WAM: Actuator data dump:
Jul 29 20:27:45 WAM WAM: [0]:Bus-0,ID-1,G-1,O-0,M-0,Off-0,Enc-4096
Jul 29 20:27:45 WAM WAM: [1]:Bus-0,ID-2,G-1,O-1,M-0,Off-0,Enc-4096
Jul 29 20:27:45 WAM WAM: [2]:Bus-0,ID-3,G-1,O-2,M-0,Off-0,Enc-4096
Jul 29 20:27:45 WAM WAM: [3]:Bus-0,ID-4,G-1,O-3,M-0,Off-0,Enc-4096
Jul 29 20:27:45 WAM WAM: [4]:Bus-0,ID-5,G-2,O-0,M-0,Off-0,Enc-4096
Jul 29 20:27:45 WAM WAM: [5]:Bus-0,ID-6,G-2,O-1,M-0,Off-0,Enc-4096
Jul 29 20:27:45 WAM WAM: [6]:Bus-0,ID-7,G-2,O-2,M-0,Off-0,Enc-4096
Jul 29 20:27:45 WAM WAM: Data for Bus 0
Jul 29 20:27:45 WAM WAM: Bus Data: There were 7 Pucks sorted by ID
Jul 29 20:27:45 WAM WAM: [0]: Actuator 0 Puck 1
Jul 29 20:27:45 WAM WAM: [1]: Actuator 1 Puck 2
Jul 29 20:27:45 WAM WAM: [2]: Actuator 2 Puck 3
Jul 29 20:27:45 WAM WAM: [3]: Actuator 3 Puck 4
Jul 29 20:27:45 WAM WAM: [4]: Actuator 4 Puck 5
Jul 29 20:27:45 WAM WAM: [5]: Actuator 5 Puck 6
Jul 29 20:27:45 WAM WAM: [6]: Actuator 6 Puck 7
Jul 29 20:27:45 WAM WAM: Bus Data: There were 2 Groups 
Jul 29 20:27:45 WAM WAM: Group 1: A0 P1 A1 P2 A2 P3 A3 P4
Jul 29 20:27:45 WAM WAM: Group 2: A4 P5 A5 P6 A6 P7 A-1 P0
Jul 29 20:27:45 WAM WAM: device_name=WAM7
Jul 29 20:27:45 WAM WAM: wam->name=WAM7
Jul 29 20:27:45 WAM WAM: bus=0, num_actuators=7
Jul 29 20:27:45 WAM WAM: OpenWAM(): for link from 0 to 7
Jul 29 20:27:45 WAM WAM: parseGetVal: key not found [WAM7.link[7].rotorI]
Jul 29 20:27:45 WAM WAM: Motor[0] is joint 0
Jul 29 20:27:45 WAM WAM: Motor[1] is joint 1
Jul 29 20:27:45 WAM WAM: Motor[2] is joint 2
Jul 29 20:27:45 WAM WAM: Motor[3] is joint 3
Jul 29 20:27:45 WAM WAM: Motor[4] is joint 4
Jul 29 20:27:45 WAM WAM: Motor[5] is joint 5
Jul 29 20:27:45 WAM WAM: Motor[6] is joint 6
Jul 29 20:27:45 WAM WAM: WAM zeroed by application
Jul 29 20:27:45 WAM WAM: About to set safety limits, VL2 = 46
Jul 29 20:27:45 WAM WAM: WAMControl period Sec:0.002000, ns: 2000000
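
When a Puck fails to power up, its status line is usually absent from the log. The sketch below scans log lines for the getBusStatus entries shown above and reports which expected node IDs never answered. It is only an illustration: the expected ID list is taken from the normal 7-DOF log above (nodes 1-7 plus node 10), and the function names are hypothetical, not part of btclient.

```python
import re

# Matches lines such as "... WAM: getBusStatus: status[3] = 2"
STATUS_RE = re.compile(r"getBusStatus: status\[(\d+)\] = (-?\d+)")

def responding_ids(log_lines):
    """Return the set of CAN node IDs that reported a status in the log."""
    ids = set()
    for line in log_lines:
        match = STATUS_RE.search(line)
        if match:
            ids.add(int(match.group(1)))
    return ids

def missing_ids(log_lines, expected_ids):
    """Return the expected node IDs that never appeared, sorted."""
    return sorted(set(expected_ids) - responding_ids(log_lines))

# Assumption: a healthy 7-DOF WAM reports nodes 1-7 plus node 10,
# as in the normal startup log above.
EXPECTED_7DOF = [1, 2, 3, 4, 5, 6, 7, 10]

sample = [
    "Jul 29 20:27:45 WAM WAM: getBusStatus: Only status != -1 is shown.",
    "Jul 29 20:27:45 WAM WAM: getBusStatus: status[1] = 2",
    "Jul 29 20:27:45 WAM WAM: getBusStatus: status[2] = 2",
    "Jul 29 20:27:45 WAM WAM: getBusStatus: status[10] = 2",
]
print(missing_ids(sample, EXPECTED_7DOF))  # prints [3, 4, 5, 6, 7]
```

To check a real log, pass an open file instead of the sample list, for example missing_ids(open("/var/log/syslog"), EXPECTED_7DOF).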

Common Problems

The symptoms listed in this section were either generated by Barrett’s own lab WAMs or were reported by Barrett’s customers.

Problem: Cannot log in to the WAM PC

Reason: Someone changed the password

Solution(s):

  1. Ask the people in your lab for the new password

Reason: The drive is full (External PC)

Solution(s):

  1. If the drive is full, the system is not able to record your login event, and the login will fail. It is likely that the /var/log/syslog files are very large, especially if you are logging data or errors from within the WAM control loop. Boot from a bootable CD (like sysresccd.org), and delete the syslog files (or other files) to make some room.
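
A periodic check can catch a filling drive before logins start to fail. The following is a minimal sketch, assuming Python 3 is available on the PC. It reports how full a filesystem is and lists the largest files under /var/log. The function names are illustrative only.

```python
import os
import shutil

def percent_used(path="/"):
    """Return the percentage of the filesystem holding `path` that is in use."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total if usage.total else 0.0

def largest_files(directory, count=5):
    """Return up to `count` (size_in_bytes, path) pairs, largest first."""
    sizes = []
    for root, _dirs, files in os.walk(directory):
        for name in files:
            full = os.path.join(root, name)
            try:
                sizes.append((os.path.getsize(full), full))
            except OSError:
                # Unreadable or vanished file: skip it
                continue
    return sorted(sizes, reverse=True)[:count]

print(f"root filesystem: {percent_used('/'):.1f}% used")
for size, path in largest_files("/var/log"):
    print(f"{size / 1e6:10.1f} MB  {path}")
```

When working from a rescue CD, point both functions at the directory where the WAM PC's drive is mounted rather than at "/".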

Reason: The network connection is not active (logging in over the network)

Solution(s):

  1. Make sure the Ethernet cable is plugged into the PC
  2. Make sure your DHCP server is active (if using DHCP)
  3. Check that the Ethernet driver is loaded (from local terminal): lspci | grep -i eth; dmesg | grep -i eth; lsmod

Reason: The drive is corrupted or damaged due to jarring, poor ventilation, or a software bug.

Solution(s):

  1. Try a new hard drive (external PC). O/S installation instructions are on wiki.barrett.com.
  2. Try a new CompactFlash (internal PC). Contact Barrett for a replacement.

Problem: WAM control application crashes/segfaults

Reason: The WAM PC cannot communicate with the Safety Board at program launch

Solution(s):

  1. Turn on the WAM power supply.
  2. Make sure that the CANbus cable is securely plugged in.
  3. Make sure that the CANbus cable is plugged into the correct port on the PC (external WAM PCs)
  4. Make sure that the CANbus cable is properly terminated, that is, that a wrist or blank outer link is attached.
  5. Make sure the CANbus card is properly seated in its PCI slot (external WAM PCs)
  6. Make sure the CAN driver is installed and loaded correctly: dmesg | grep -i pcan; cat /proc/pcan; lsmod
  7. Check for a broken CANbus or power wire, usually in a connector or at the safety puck.
  8. Check for a loose CANbus or power crimp in the connectors.
  9. Attach the Puck serial cable to the safety board to verify that the safety puck is functional. Could it have overheated?

Reason: You are accessing data outside of the program’s memory space (not likely with standard example programs).

Solution(s):

  1. Use gdb to determine the offending line of source code.

Problem: The pendants do not initialize when the WAM power supply is turned on

Reason: The Pendant(s) are malfunctioning

Solution(s):

  1. Turn off the power, and remove the Pendants from the backplate.
  2. Turn on the power, and listen for the relay to click, and the PC/104 to beep.
  3. If the relay clicks and the PC/104 beeps, the safety board is working, but the pendants are not. Contact Barrett for repair.
  4. If the relay does not click, and the PC/104 does not beep, the safety board is not working.

Reason: The safety board fuse is blown

Solution(s):

  1. Remove the fuse (reference designator F1) from the safety board.
  2. Test the fuse for continuity using a multimeter. If the fuse is open, replace it with a new one. Fuse type is 250V, 10A, 5x20mm Slo-Blo.

Reason: There is no power

Solution(s):

  1. Check for proper voltage at the WAM's power supply source (wall or battery)
  2. Check for any custom current limiting circuits you may be using. This may include the current limit dial on a variable power supply, or self limiting circuitry on commercial power supplies
  3. Check the output voltage of the power supply (should be 48-51 VDC, nominal)

Reason: The Puck is in Firmware Download Mode

Solution(s):

  1. Make sure both positions of switch SW3, near the puck, are set correctly. They should be away from the puck (towards PC/104) for normal operation. If they are not, the puck will not start the safety board.

Reason: There is some other electrical problem

Solution(s):

  1. Contact Barrett for repair

Problem: The pendant initialization sequence is incorrect when the WAM power supply is turned on

Reason: The Data/Clock/Latch signals to the pendants are weak

Solution(s):

  1. Make sure both pendant cables are plugged securely into the WAM

Reason: There is some other electrical problem

Solution(s):

  1. Contact Barrett for repair

Problem: Some joints have resistive braking, some do not. The angles that btdiag returns for the joints without resistive braking are incorrect. This is easiest to check for at the home position.

Reason: The Pucks without resistive braking are not powering up correctly. NOTE: It is normal to have no resistive braking in all joints after turning on the WAM and pressing Shift-Idle, but before you launch a WAM control program.

Solution(s):

  1. Check for a broken CANbus or power wire, usually in a connector near the Puck.
  2. Check for a loose CANbus or power crimp in the connectors near the Puck.
  3. Check for a loose CANbus or power wire at the Puck itself.
  4. Check the syslog for clues (compare it with the syslog above).
  5. Contact Barrett for repair

Problem: Joint readings “bounce” from reasonable/actual values to values that are not/cannot be true and back again.

Reason: Realtime control violation

Solution(s):

  1. Make sure the control loop avoids system calls that are incompatible with realtime control: no UI calls such as printf() or getch(), and no I/O calls such as read() or write(). Just about anything other than loops and pure math should be avoided.
  2. Make sure your total WAM control loop time, including the WAMCallback(), does not exceed the time allowed by the control rate. Keep in mind that the data time-of-flight on the CANbus is approximately 850 µs for a 7-DOF and 500 µs for a 4-DOF. This takes up a significant amount of the total control loop time.
  3. If you built your own WAM PC, a system hardware interrupt might be causing a realtime glitch. Check your installation against: http://www.rtai.dk/cgi-bin/gratiswiki.pl?Latency_Killer
  4. Extra getProperty() or setProperty() calls could be exceeding the CANbus bandwidth. setProperty() takes 75 µs without verification, and getProperty() takes 150 µs. You might try staggering these function calls across multiple control cycles.
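
The timing figures above can be combined into a rough per-cycle budget. The sketch below assumes a simple additive model (CAN time-of-flight, plus property calls, plus your callback time), which is an approximation rather than a measured result. The 400 µs callback time and the function names are hypothetical.

```python
# Timing figures quoted above, in microseconds
CAN_TIME_OF_FLIGHT_US = {7: 850, 4: 500}  # keyed by number of DOF
SET_PROPERTY_US = 75                      # setProperty() without verification
GET_PROPERTY_US = 150                     # getProperty()

def remaining_budget_us(period_us, dof, n_get=0, n_set=0, callback_us=0):
    """Time left in one control cycle after CAN traffic and user code.

    Assumes the costs simply add up; a negative result means the
    cycle is over budget.
    """
    used = (CAN_TIME_OF_FLIGHT_US[dof]
            + n_get * GET_PROPERTY_US
            + n_set * SET_PROPERTY_US
            + callback_us)
    return period_us - used

def stagger(calls, per_cycle):
    """Split a list of pending property calls into per-cycle batches."""
    return [calls[i:i + per_cycle] for i in range(0, len(calls), per_cycle)]

# 2000 us period (the 0.002 s shown in the startup log), 7-DOF WAM,
# four getProperty() calls and a hypothetical 400 us callback
print(remaining_budget_us(2000, 7, n_get=4, callback_us=400))  # prints 150

# The same four reads staggered, one per cycle
print(remaining_budget_us(2000, 7, n_get=1, callback_us=400))  # prints 600
print(stagger(["read_a", "read_b", "read_c", "read_d"], 1))
```

Under these assumptions, staggering four reads across four cycles raises the per-cycle margin from 150 µs to 600 µs.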

Reason: CANbus communication problem

Solution(s):

  1. Use the btutil utility application to verify that the Puck firmware versions match for all motor Pucks. Load matching firmware onto all motor Pucks, if necessary.
  2. Ensure that CANbus is properly terminated to minimize signal reflections. There is a 120 Ohm termination resistor in the wrist module (7-DOF) and blank outer link (4-DOF). If you are using the internal WAM PC, switches 1-2 and 1-3 should be “out” (see Section 1.5). This terminates the CANbus at the Safety Board with a 120 Ohm resistor. If you are using an external WAM PC, the purple CANbus cable has a 120 Ohm resistor in the connector at the PC end. In this case, switches 1-2 and 1-3 should be “in”.
  3. Use btutil to ensure that there are no conflicting Puck IDs on the CANbus. Each Puck must have a unique ID. If the Pucks are not enumerating correctly, disconnect them from the CANbus one-by-one until you find the one causing the conflict. Use btutil to set a new Puck ID for that puck.

Reason: Grounding problem

Solution(s):

  1. Eliminate CANbus ground loops. The CANbus shield is connected to Earth ground on the Safety Board. If your PC is also grounded, you should use an opto-isolated CANbus card to prevent ground loops.
  2. Eliminate power bus ground loops. The frame of the WAM is connected to Earth ground through the blue DC power cable. The frame of the 48 VDC supply is also Earth-grounded. If there is an additional electrical connection between the frame of the WAM and the frame of the power supply (such as a metal table or mounting bracket), this can cause a ground loop and undesired operation.

Problem: Running “sh makeall” causes a number of error messages to appear, including “cannot find -lntcan”.

Reason: The linker is looking for the wrong CAN driver

Solution(s):

  1. Edit your btclient/config.mk file to specify the correct CAN driver

Reason: The CAN driver is not installed

Solution(s):

  1. Install the correct CAN driver for your CAN hardware

Problem: A mechanical cable is fraying, birdnesting, or has come loose

Reason: The maximum recommended payload was exceeded

Solution(s):

  1. Note that the payload specification does not take into account accelerated loads. If you attach a 3 kg load to the 4-DOF and accelerate it at 2 g’s, the WAM experiences a 6 kg load, exceeding the recommended payload. The cable should be replaced before further use.
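
The accelerated-load rule of thumb above is simple arithmetic: the effective load is the attached mass multiplied by the acceleration in g. A small sketch follows. The rated payload is a parameter you must supply from your WAM's specification, and the function names are hypothetical.

```python
def effective_load_kg(mass_kg, accel_g):
    """Load the WAM experiences, per the rule of thumb in the text."""
    return mass_kg * accel_g

def exceeds_rating(mass_kg, accel_g, rated_payload_kg):
    """True if the accelerated load is above the rated payload."""
    return effective_load_kg(mass_kg, accel_g) > rated_payload_kg

# The example from the text: 3 kg accelerated at 2 g
print(effective_load_kg(3.0, 2.0))  # prints 6.0
```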

Reason: A brass termination slipped off the end of the cable.

Solution(s):

  1. This is a manufacturing defect. Contact Barrett to get the cable replaced.

Problem: During operation, you hear a “popping” sound accompanied by a distinct nasty smell.

Reason: An electrical component burned up.

Solution(s):

  1. Carefully record all events leading up to the failure. Contact Barrett for repair.

Problem: The WAM returns to home position before starting a new trajectory.

Reason: The first point in the trajectory is the home position.

Solution(s):

  1. Inspect the trajectory file with a text editor to determine if this is the case. Every trajectory file is just a comma-delimited file with time (in seconds) in the left column and (depending on the mode) the joint positions or the Cartesian positions in the remaining columns.
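
The same check can be scripted. The sketch below reads the first row of a comma-delimited trajectory file and compares it with a home position. It assumes the file has no header or comment lines. The HOME values and the sample rows are hypothetical placeholders, so substitute your WAM's actual home position.

```python
import csv
import io

def first_point(traj_file):
    """Return (time, positions) from the first data row of a trajectory file."""
    for row in csv.reader(traj_file):
        if not row or not "".join(row).strip():
            continue  # skip blank lines
        values = [float(field) for field in row]
        return values[0], values[1:]
    raise ValueError("trajectory file contains no data rows")

def starts_at(traj_file, home, tol=1e-3):
    """True if the first trajectory point matches `home` within `tol`."""
    _time, positions = first_point(traj_file)
    return (len(positions) == len(home)
            and all(abs(p - h) <= tol for p, h in zip(positions, home)))

# Hypothetical 4-DOF home position and trajectory, for illustration only
HOME = [0.0, -2.0, 0.0, 3.14]
sample = io.StringIO("0.0, 0.0, -2.0, 0.0, 3.14\n1.0, 0.1, -1.8, 0.0, 3.0\n")
print(starts_at(sample, HOME))  # prints True
```

For a real file, pass open("your_trajectory_file") in place of the StringIO sample.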

Problem: A slight knocking can be heard when moving a particular cable circuit of the WAM (it may also be felt if backdriving the WAM by hand). As the knock occurs, a slight backlash may be felt.

Reason: Debris in a joint bearing.

Solution(s):

  1. Apply several drops of oil (3-in-1, WD-40 in liquid form, or another light-viscosity mineral oil should work). The debris has about a 50% chance of working itself out eventually.

Reason: Loose ball-bearing retainer

Solution(s):

  1. This was eliminated in newer designs. It usually does not interfere with operation, apart from the clicking noise.

Reason: Insufficient motor shaft axial preload/shimming – the rotor/shaft should not be able to move axially inside the motor.

Solution(s):

  1. Return to Barrett for repair.

Problem: The robot fails to follow scaled trajectories.

Reason: The WAM does not have the latest OS/firmware/software.

Solution(s):

  1. Follow the directions on wiki.barrett.com to update your firmware/software.

Problem: Gravity Compensation does not operate correctly.

Reason: The wam.conf file links to the incorrect configuration file based on the present WAM setup

Solution(s):

  1. Change the link in wam.conf to the correct configuration file: WAM4 (4-DOF), WAM7 (4-DOF with Wrist), or WAMG (4-DOF with gimbals)

Reason: The parameters in the configuration file do not match your system’s actual configuration.

Solution(s):

  1. Check the linked configuration file for the following:
    • Correct the tool center point (DH parameters d, a, alpha). These are offsets from the final link’s frame.
    • Correct the tool Center-Of-Mass (COM), relative to the tool center point
    • Correct the tool mass
    • In pre-2007 WAMs, Motor 4 (M4) was located over M. After 2007, M4 was reversed and located over M. The transformation matrices need to match your WAM’s M4 configuration. Basically, the sign changed from + to – for the M4 transmission.
    • The latest files in config/ will not necessarily match your WAM. You should always use the config file that shipped with your WAM. Update that file with *new features* from the latest config files, but be careful not to change the kinematic and dynamic constants that are specific to your WAM.

Reason: Mass parameters are inaccurate due to model errors, unmodeled parts, or machinist tolerances.

Solution(s):

  1. On-line calibration (either static or dynamic) can yield better data than the model. Barrett is working on calibration software and will notify the WAM User List when it is released.

Reason: Electrical wiring stiffness requires additional joint torque to overcome pulling effect near joint limits

Solution(s):

  1. This requires complex modeling algorithms that Barrett has not yet developed.

Reason: We depend on a sampled motor torque constant to be accurate across a batch of motors in a manufacturing run. In fact, the torque constants could vary slightly.

Solution(s):

  1. Calibration software could reduce errors due to inaccurate torque constants. Barrett is working on calibration software and will notify the WAM User List when it is released.

Problem: Safety Board is randomly rebooting. The pendants reinitialize themselves.

Reason: There are electrical noise spikes/dips in the power circuitry of the Safety Board.

Solution(s):

  1. Contact Barrett for a replacement board.

Problem: Pendant lights do not come on (specifically the IDLE light). The Pucks are online anyway, but they do not make the PWM sound when <SHIFT+ACTIVATE> is pressed.

Reason: The pendant is electrically damaged

Solution(s):

  1. Contact Barrett for repair.

Problem: One or more joints of the WAM vibrate during trajectory following.

Reason: The PID control gains for the joint are too high. The default PID control gains assume some non-zero payload in gravity. If you have no payload, or the joint is facing the floor or the ceiling (that is, if gravity is not a factor and there is no externally-applied torque on the joint), then the joint control gains could be too high.

Solution(s):

  1. Lower the PID gains for that joint (in the wam.conf file). Start by cutting P and D in half.

Problem: One or more joints of the WAM oscillate or otherwise go unstable during trajectory following.

Reason: The PID control gains for the joint are too low.

Solution(s):

  1. Raise the PID gains for that joint (in the wam.conf file)

Reason: The payload is too heavy, the motor torque is saturating at Max Torque (MT), and the control cannot keep up.

Solution(s):

  1. Reduce the payload. Remember that you must take into account accelerated loads while staying under the max payload specification. For example, the max payload for the 7-DOF is 3 kg. This means you can accelerate 1.5 kg at 2 g, or you can hold 3 kg statically at the max reach.
  2. Increase the MT parameter for the motors. However, the default MT values are set to stay within the breaking strength of the steel cables. Increasing these values increases the likelihood of cable failure.

Reason: The simple PID control that ships with the example code is not robust enough to handle loads offset significantly from the tool center point or loads with unusual inertial characteristics.

Solution(s):

  1. Design a more robust control method for your application

Problem: WAM will not enter Idle mode. The WAM starts with a low voltage fault, and when <Shift+Reset/Idle> is pressed, the pendants reinitialize instead of entering Idle mode.

Reason: The Puck power bus is shorted together. This could be due to a stuck relay, loose wire, metallic debris on Safety Board, or a damaged Puck.

Solution(s):

  1. Turn off power, and remove the three-pin connector J9 from the safety board. Do not move the WAM once J9 is disconnected, because it is no longer braked.
  2. Try to Shift-Idle again. If the safety board re-initializes, then there is an issue with the safety board. Contact Barrett for repair or for the repair procedure.
  3. If the board no longer resets on Shift-Idle, then the short circuit is in one of the Pucks in the WAM. Remove the Bhand (if applicable), and re-Idle. If there is no failure, the problem is with the Bhand. Contact Barrett for repair.
  4. Remove the Wrist (if applicable), and try to re-Idle. If there is no failure, the problem is with the wrist. Re-attach the hand to the outer link, and make sure there are also no faults on the Bhand.
  5. Remove the power connector from each of pucks 2-4, and re-Idle. Each time there is no failure, re-connect a puck. Each time there is a failure, keep that puck disconnected.
  6. After determining which of pucks 1-7 or 11-14 is the cause of the failure, contact Barrett for repair.