Cleaning Lidar Data: Object Identification

By Gary Angel


May 10, 2024

Ghosts, fragments, reflections and munged objects are common defects in lidar data. How common? There’s no real answer to that question. It depends on the quality of the lidars, their number and placement, the perception software, the complexity of the environment, and how crowded a location is. Put a couple of high beam-count lidars in a big open room and track a single person and your results will be perfect. But as you add occlusions, environmental factors (like wind or rain), and – most of all – crowds, you’ll start to see more and more data problems. Our experience with real-world implementations suggests that even with the best equipment in fairly controlled environments, the data will need some cleaning before use. In difficult environments, the data will need a LOT of cleaning before use. People measurement is complicated, and anyone who’s ever worked on physical implementations knows that every location tends to have its own unique complexities and complications.


Before you start cleaning the data yourself, it’s worth paying attention to the cleaning options you have at the Perception software level. The better the data that comes from that layer, the better your overall data quality will be. We’re often reluctant, though, to take full advantage of the feature set at the Perception layer. Drawing exclusion zones in the Perception software is a blunt instrument since it kills all reporting in that area. Similarly, allowing the Perception software to blend tracks is nearly always a disaster: once you’ve blended tracks into a single id you can’t unblend them, and A) the software often sucks at it, and B) the Perception layer lacks the context and history that can make stitching better.


However, if there are areas you flat out don’t want to measure, then excluding them in the lidar is useful. In big open areas (no walls), drawing a measurement boundary of the area with good detection is very worthwhile. System edges will generate a LOT of bad data that can skew metrics and make the system look much worse than it is. People viewing the data won’t understand that the edge of a scene has much worse data quality. Finally, setting appropriate thresholds for object identification will often remove a lot of garbage from the system and will make clean-up and stitching significantly easier.


That said, most of the cleaning we do is post-Perception, and I’ll broadly group these cleaning strategies into behavioral, area-based, and dimension-based techniques, along with combinations of all three.



Behavior-Based Cleaning


Many of the most common lidar data quality problems generate tracks with very limited and distinct movement patterns. If lidar is tracking a swinging door or a tree swaying in the wind or a gas pump handle being moved, you’ll see an object with movement. That object may persist for a bit or be very ephemeral. It may appear and re-appear over and over.


It can be trivial to get rid of these fragments with broad rules that simply eliminate any record that doesn’t move a certain amount (like 2 meters) or that doesn’t last for at least a specific duration (like 2 seconds). This can work. But it can also create a whole raft of new problems. This is especially true when you’re seeing a lot of track-breakage. Those fragmentary records can be important in helping stitch together a real track.


Think about a scene like this as a pedestrian walks (or jogs) between the trees. Each tree may cause a break in the track record if the measurement lidar is across the street. That will produce a series of tracks, each of which might be quite short and quick. Stitch them together and you have a coherent journey.


[Image: Cleaning Lidar Data: Object Identification]


You may think that the distance between the trees is large enough that you won’t get short fragments, but that’s probably not true. The problem (which I’ll cover in another post) is that there is a time asymmetry in lidar between detecting stops and starts. When the object is occluded, measurement stops almost instantly. But when it re-appears most lidar systems will take a little over a second to re-acquire it. That means most of the distance between the trees is lost to re-acquisition.


From a practical perspective, this means we like to have two classes of rules – pre and post stitch. Pre-stitch, we’re very careful about removing fragments unless they are very short and show very little movement. Post-stitch, we’re much more aggressive about fragment removal. If we couldn’t find a match and all we have is 2-3 seconds of not much movement, we’ll kill the record.


This two-level strategy for fragment removal is much more reliable.


We also tend to require a combination of both short time and short distance to remove a fragment. Particularly in restricted scenes or locations with fast-movers (like cars), an object can go quite a distance in 3 to 5 seconds. It’s also true that most locations will have valid object fragments that show very little motion but considerable time. Retail stores are dense with occlusions, and we frequently lose shoppers as they walk behind displays or bend down to look low on a shelf. If a bent-down shopper then stands straight and spends thirty seconds at the display, we want to be able to stitch them as they leave that display. Yet they may hardly have moved.


The good thing about only eliminating tracks with both short time and distance pre-stitch is that those records generally won’t prevent a correct stitch from happening.
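To make the two-level idea concrete, here is a minimal Python sketch. It is an illustration under assumptions, not our production rule set: the `Track` class, the threshold values, and the function names are all hypothetical, and the stitching step that would run between the two passes is left out.

```python
from dataclasses import dataclass
from math import hypot


@dataclass
class Track:
    track_id: str
    points: list  # [(t_seconds, x_m, y_m), ...] ordered by time, non-empty

    @property
    def duration_s(self) -> float:
        return self.points[-1][0] - self.points[0][0]

    @property
    def extent_m(self) -> float:
        # Farthest the centroid ever gets from its first position.
        _, x0, y0 = self.points[0]
        return max(hypot(x - x0, y - y0) for _, x, y in self.points)


# Hypothetical thresholds: cautious before stitching, aggressive after.
PRE_STITCH = {"max_seconds": 1.0, "max_meters": 0.5}
POST_STITCH = {"max_seconds": 3.0, "max_meters": 2.0}


def is_fragment(track, max_seconds, max_meters):
    # Removable only if the track is BOTH short-lived and nearly static.
    return track.duration_s < max_seconds and track.extent_m < max_meters


def drop_fragments(tracks, max_seconds, max_meters):
    return [t for t in tracks if not is_fragment(t, max_seconds, max_meters)]
```

In use, you would call `drop_fragments(tracks, **PRE_STITCH)`, run your stitcher on what survives, and then call `drop_fragments(stitched, **POST_STITCH)`. Because both conditions must hold, a thirty-second track that barely moves (the shopper at the display) survives both passes.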


Why remove ANY records pre-stitch? Sometimes we don’t. But stitching broken tracks is complicated and the fewer alternatives to consider, the more likely we are to get it right. Cleaning up fragments can make stitching significantly more accurate.


It’s also important to keep in mind that movement differentials can be influenced by object type. One of the oddities of lidar measurement is that movement in an X, Y sense is centroid based. But the centroids of objects often shift somewhat unpredictably in the point cloud. That means that a person standing still will often move a little bit – usually less than ½ a meter. But a vehicle point cloud is much bigger and the centroid shifts can be larger. A parked vehicle may register sudden shifts in position that are 1-2 meters in distance. Nothing has changed, and it’s not unusual for the object centroid to shift right back.
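One way to account for this is to make the "did it really move?" test depend on object type. The sketch below is a hedged illustration: the allowance values are assumptions loosely based on the figures above, and using the median position as the anchor is my choice for the example, not something any Perception vendor prescribes.

```python
from math import hypot
from statistics import median

# Hypothetical centroid-jitter allowances in meters, per object type.
CENTROID_JITTER_M = {"person": 0.5, "vehicle": 2.0}


def is_stationary(positions, object_type):
    """positions: non-empty list of (x_m, y_m) centroids for one track.

    Returns True if every centroid stays within the jitter allowance
    of the track's median position.
    """
    allowance = CENTROID_JITTER_M[object_type]
    cx = median(x for x, _ in positions)
    cy = median(y for _, y in positions)
    return all(hypot(x - cx, y - cy) <= allowance for x, y in positions)
```

The median is used rather than the first point so that a single sudden centroid shift (which often shifts right back) doesn’t move the anchor itself.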


One other behavior commonly used for cleaning is velocity. If we see objects that are moving much faster than a person in a retail store really can, we’ll filter them. Velocity is trickier in mixed object environments (where it’s often used for classification, not identification), but in single object environments it can be powerful. It can also be used to detect invalid frame jumps in the Perception software. We find that most Perception software vendors will sometimes jump from a point-cloud to a nearby one with a 1-2 second gap. That’s not behavior we like to see in the software. It’s sometimes correct, but it’s usually wrong. We’ll break that track and then allow our much more sophisticated stitching engine to decide if it’s correct or not. And one of the ways we decide about breakage is by looking at the velocity of the implied jump. As far as we can tell, that’s not something the Perception software vendors ever do.
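As a rough illustration of the implied-velocity check, the sketch below splits a track wherever a time gap is followed by a position that would have required an implausible speed to reach. The function name, the 1-second gap threshold, and the speed limit are assumptions for the example; a real rule would be tuned per location and object type.

```python
from math import hypot


def split_on_implausible_jumps(points, max_speed_mps, min_gap_s=1.0):
    """Split one track into segments at suspicious jumps.

    points: [(t_seconds, x_m, y_m), ...] ordered by time.
    A break is made when consecutive points are at least min_gap_s apart
    AND the speed implied by the jump exceeds max_speed_mps.
    """
    if not points:
        return []
    segments = [[points[0]]]
    for prev, cur in zip(points, points[1:]):
        dt = cur[0] - prev[0]
        dist = hypot(cur[1] - prev[1], cur[2] - prev[2])
        if dt >= min_gap_s and dist / dt > max_speed_mps:
            segments.append([cur])  # start a new segment after the jump
        else:
            segments[-1].append(cur)
    return segments
```

The resulting segments can then be handed to the stitcher as separate candidates, so the decision about whether the jump was real is made with more context.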



Area-Based Cleaning


I recommended against using zones in the lidar to do exclusions because it is such a blunt instrument, but that isn’t the only drawback. Very few users know how to use the Perception software, it’s often kind of inaccessible, and mucking around with it can cause huge data problems. It can also be hard to use, since it’s built for engineers. For all those reasons, area-based cleaning is best done above the Perception layer.


What kind of area-based cleaning is appropriate? First, there may be full-on exclusion areas where you never want to track. More commonly, there are areas where you don’t want to track objects that only start and remain in that area. That’s an important distinction. If your lidar is getting fooled by a door, you don’t want to exclude tracking in the door area, but you might want to exclude any fragment that starts in and never leaves that area.
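The "starts in and never leaves" rule is simple to express. Here is a minimal sketch that assumes rectangular, axis-aligned zones for brevity (real zones are usually polygons); the `Zone` class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Zone:
    name: str
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    def contains(self, x, y):
        return self.x_min <= x <= self.x_max and self.y_min <= y <= self.y_max


def confined_to(points, zone):
    # True only if the track has points and every one of them is inside the zone.
    return bool(points) and all(zone.contains(x, y) for _, x, y in points)


def drop_confined_tracks(tracks, zones):
    """tracks: {track_id: [(t_seconds, x_m, y_m), ...]}.

    Removes tracks that never leave one of the given zones; tracks that
    merely pass through a zone are kept.
    """
    return {
        tid: pts
        for tid, pts in tracks.items()
        if not any(confined_to(pts, z) for z in zones)
    }
```

A person who walks through the door zone has points outside it and is kept, while a swinging-door ghost that lives entirely inside the zone is dropped.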


Likewise, different areas may require different cleaning strategies. In a seating area, you’re going to want to be very generous with long timeframe objects that don’t move. On a highway? Not so much. We set overall cleaning rules for a location and then implement area-specific ones whenever necessary that either strengthen or loosen those boundaries.
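One simple way to structure this is a location-wide default rule set with per-area overrides layered on top. The rule names and values below are invented for illustration; the point is only the override pattern.

```python
# Hypothetical location-wide defaults.
DEFAULT_RULES = {"max_stationary_s": 120.0, "min_move_m": 1.0}

# Hypothetical area-specific overrides that loosen or tighten the defaults.
AREA_OVERRIDES = {
    "seating": {"max_stationary_s": 3600.0},  # people sit still for a long time
    "highway": {"max_stationary_s": 10.0},    # nothing valid should sit still here
}


def rules_for(area):
    # Area values win over defaults; unknown areas just get the defaults.
    return {**DEFAULT_RULES, **AREA_OVERRIDES.get(area, {})}


def keep_stationary_object(area, stationary_seconds):
    return stationary_seconds <= rules_for(area)["max_stationary_s"]
```

With these values, an object that sits still for ten minutes is kept in the seating area but removed on the highway.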


The beauty of area-based cleaning is that in many, many real-world situations, knowing where something is tells you a lot about what it is and how it should behave. Perception software systems don’t allow for any kind of flexible rule-making, but your People-Measurement-Platform (PMP) can and so can any hands-on engineer.


If you are rolling your own rule-set, this is an area where developers often need a little prodding. Developers hate rule sets that make exceptions here or there – and (being a programmer by background) I get it. But in a geo-location system, area-based rules are part of what the system should do.



Dimension-Based Strategies


One of the most powerful aspects of 3D lidar is its ability to understand the true shape and size of an object. I might write a full post on this and why it sets lidar apart from camera-based systems, which must do a lot of ML to figure those things out. Yet as powerful as it is, lidar object dimensions can be tricky, so dimensions tend to be under-utilized in Perception software identification. This is a case where you may need to carefully evaluate your data before figuring out whether and when dimensioning can be an effective part of data-cleaning.


With all lidar Perception software, you’ll receive an object list that can optionally include a dimension box size (x, y, z) as well as the normal positional coordinates of the object centroid. This dimension box is much less detailed than the point-cloud it’s generated from. Still, it’s more than enough to use in a variety of situations. While many lidar Perception systems will allow you to set minimum sizes for object detection, the defaults tend to be very low. Adjusting them is definitely worth considering. If you detect a shoebox-sized object in a store, it’s not a person.


Or is it?


The challenge with lidar box-sizes is that they are very vulnerable to angles and occlusions. If all the lidar sees is somebody’s head, that’s what the box size will represent. A good Perception software system will use multiple sensors to construct a full 3-D point cloud, but that 3-D point view is only available if more than 1 lidar sensor has beams on the target. If you only have one lidar sensor covering an area, you’ll almost never get a true 3-D representation of the object. When you have multiple lidars, things get much better – but how much better is a complicated question. Lidars have very long ranges and will often create significant overlap. But they are also vulnerable to occlusion. What that means is that in some places you’ll have good overlap and in others you won’t. And in some places where you have overlap, sometimes you’ll get a full 3-D point cloud and sometimes you won’t (as crowds create occlusion on different angles of view).


To make things even more complicated, most lidar object feeds don’t tell you how many sensors had beams on the object or how many points each one had. The net result is that you need to understand your deployment very well before you start making decisions about how to use box-size to clean up object identifications.


One advantage that post-Perception dimension cleaning has is that it can take advantage of the dimensions across many frames. Each frame comes with a dimension and because of the issues described above (and others), the dimensions of an object will often change dramatically over time as the lidar’s view changes. That’s a good thing. It gives you an over-time average dimension and a maximum dimension to use. The angle and nature of occlusions usually cause objects to look smaller than they are. That means the biggest size you get is often the most indicative. Like almost everything else with data quality and real-world people-measurement, there are some exceptions to this rule, but by and large it’s useful.
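Here is a small sketch of that over-time summary. It assumes each frame carries a (length, width, height) box in meters, and the minimum person height is a made-up threshold for the example.

```python
from statistics import mean

# Hypothetical threshold: a track whose tallest box never reaches this
# height is treated as too small to be a person.
MIN_PERSON_HEIGHT_M = 1.0


def summarize_dimensions(frames):
    """frames: non-empty list of (length_m, width_m, height_m), one per frame."""
    axes = list(zip(*frames))  # regroup per-frame boxes into per-axis series
    return {
        "max": tuple(max(axis) for axis in axes),
        "mean": tuple(mean(axis) for axis in axes),
    }


def plausibly_person(frames):
    # Occlusion usually shrinks the box, so judge by the largest height seen.
    return summarize_dimensions(frames)["max"][2] >= MIN_PERSON_HEIGHT_M
```

A track that shows only a head-sized box in most frames but a full-height box in even one frame passes, while a track that is shoebox-sized in every frame does not.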



Combination Techniques


It’s often a good idea to combine these three techniques into specific rules for data cleaning. If you know that you’re looking at an active roadway and your primary goal is to count cars, you can do things with dimension sizing that won’t work if you’re also getting data from, for example, the curbside area of an airport where you need to count cars, people and (perhaps) luggage.


The whole point of downstream data quality improvement (post-Perception) is to take advantage of what you know. That means using where an object is identified, what it does over time, and how its dimensions look (and change) over time.
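As a sketch of what a combined rule might look like, the example below keys the decision on where the object is, then applies dimension or behavior tests that fit that area. The area names, field names, and thresholds are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class TrackSummary:
    area: str            # where the track was seen (hypothetical area label)
    duration_s: float    # how long the track lasted
    distance_m: float    # how far it traveled
    max_length_m: float  # largest box length over all frames
    max_height_m: float  # largest box height over all frames


def keep_track(s):
    # Behavior test shared by the non-roadway rules: real objects either
    # last a while or cover some ground.
    behaves_like_object = s.duration_s >= 3.0 or s.distance_m >= 2.0

    if s.area == "roadway":
        # Only counting cars here, so size alone can be decisive.
        return s.max_length_m >= 2.5
    if s.area == "curbside":
        # Cars, people and luggage mix, so accept small boxes but
        # still require a minimum height plus plausible behavior.
        return s.max_height_m >= 0.3 and behaves_like_object
    return behaves_like_object
```

The same person-sized track is rejected on the roadway and kept at the curbside, which is exactly the kind of area-dependent exception the rules are meant to encode.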


Putting these kinds of cleaning rules together can feel like a kludge, but it isn’t. It’s just smart to use what you know to make data better.


In my next post, I’ll take a look at how to improve the downstream quality of your Perception software’s object classification. It will be a lot of the same techniques, but with some new twists.
