Visual navigation and servoing for object manipulation with mobile robots


DISSERTATION

submitted in partial fulfillment

of the requirements for the degree

Doktor Ingenieur

(Doctor of Engineering)

in the

Faculty of Electrical Engineering and Information Technology

at TU Dortmund University

by

Dipl.-Ing. Thomas Nierobisch

Schwäbisch Gmünd, Germany

Date of submission: 21st January 2014

First examiner: Univ.-Prof. Dr.-Ing. Prof. h.c. Dr. h.c. Torsten Bertram

Second examiner: Univ.-Prof. Dr.-Ing. Bernd Tibken

Date of approval: 22nd June 2015


"... frontier, it is exciting and disorganised; there is often no reliable authority to appeal to - many useful ideas have no theoretical grounding, and some theories are useless in practice."

Forsyth and Ponce

Authors of Computer Vision: A Modern Approach

Abstract

In the future, autonomous service robots are expected to relieve people of monotonous and tedious tasks such as pickup and delivery. Vision, being the most important human sensor and feedback system, is considered to play a prominent role in the future of robotics. Robust techniques for visual robot navigation, object recognition and vision-assisted object manipulation are essential in service robotics tasks. Mobile manipulation in service robotics applications requires the alignment of the end-effector with recognized objects of unknown pose. Image-based visual servoing provides a means of model-free manipulation of objects that relies solely on 2D image information.

This thesis presents contributions to the field of decoupled visual servoing for object manipulation as well as navigation. A novel approach for large view visual servoing of mobile robots decouples the gaze and navigation control via a virtual camera plane, which enables the visual controller to use the same natural landmarks efficiently over a large range of motion. In order to complete the repertoire of reactive visual behaviors, an innovative door passing behavior and an obstacle avoidance behavior using omnivision are designed. The developed visual behaviors represent a significant step towards the model-free visual navigation paradigm relying solely on visual perception. A novel approach for visual servoing based on augmented image features is presented, which has only four off-diagonal couplings between the visual moments and the degrees of motion. As the visual servoing relies on unique image features, object recognition and pose alignment of the manipulator rely on the same representation of the object. In many scenarios the features extracted in the reference pose are only perceivable across a limited region of the work space. This necessitates the introduction of additional intermediate reference views of the object and requires path planning in view space. This thesis presents a model-free approach for optimal large view visual servoing that switches between reference views in order to minimize the time to convergence.

The efficiency and robustness of the proposed visual control schemes are evaluated in virtual reality and on the real mobile platform as well as on two different manipulators. The experiments are performed successfully in different scenarios in realistic office environments without any prior structuring. This thesis therefore presents a major contribution towards vision as the universal sensor for mobile manipulation.

Kurzfassung (German abstract)

In the future, autonomous service robots are expected to relieve people of monotonous and physically strenuous tasks, for example by performing pickup and delivery services. Visual perception is the most important human sense and feedback system and will therefore play a prominent role in future robotics applications. Robust methods for image-based navigation, object recognition and manipulation are essential for applications in service robotics. Mobile manipulation in service robotics requires the alignment of the end-effector with recognized objects of unknown pose. Image-based visual servoing enables model-free object manipulation by considering only the two-dimensional image information.

This thesis presents contributions to decoupled image-based visual servoing for both object manipulation and navigation. A novel approach for image-based large view visual servoing of mobile robots is introduced. The gaze and navigation control are decoupled by a virtual camera plane, which allows the image-based controller to use the same natural landmarks efficiently over a wide range of motion. To complete the repertoire of visual behaviors, an innovative door passing behavior and an obstacle avoidance behavior based on omnidirectional perception are developed. The designed visual behaviors represent an important step towards the paradigm of purely model-free visual navigation. A novel approach based on image features with an extended number of attributes is introduced, which, after decoupling of the input variables, exhibits only four undesired couplings between the image moments and the degrees of freedom of motion. In many application scenarios the extracted reference features are visible only in a limited region of the work space. This requires the introduction of additional intermediate views of the object as well as path planning in the two-dimensional image space. This thesis therefore presents a model-free method for time-optimal image-based large view visual servoing that switches between the individual reference views in order to minimize the convergence time.

The efficiency and robustness of the proposed image-based controllers are verified in virtual reality as well as on the real mobile platform and on two different manipulators. The experiments are carried out in different scenarios in everyday office environments without prior structuring. This thesis represents an important step towards visual perception as the sole and universal sensor for mobile manipulation.

Contents

1 Introduction
1.1 Mobile manipulation
1.2 Related work
1.3 Objective of this thesis

2 State of the art of computer vision and visual servoing
2.1 Perspective camera, multiple-view geometry and omnivision
2.2 Robust point feature detection for recognition
2.3 Visual navigation
2.4 Image-based visual servoing
2.5 Experimental systems for visual servoing, navigation and localization

3 From vision guided to visual navigation of mobile robots
3.1 Vision-guided navigation
3.1.1 Planning
3.1.2 Topological localization
3.2 Visual behavior for door passing
3.3 Visual behaviors for collision-free navigation
3.3.1 Corridor centering
3.3.2 Obstacle avoidance by optical flow


4 Global visual homing by visual servoing
4.1 General concept
4.2 Virtual camera plane
4.3 Camera gaze control
4.4 Visual navigation control
4.4.1 Control by image Jacobian
4.4.2 Control with image moments and primitive visual behaviors
4.4.3 Control with homography
4.4.4 Experimental results
4.5 Comparison of vision guided and visual navigation

5 Local visual servoing with generic image moments
5.1 Augmented point features
5.2 Generic moments
5.2.1 Moments for rotation
5.2.2 Moments for translation
5.2.3 Coupling analysis of the sensitivity matrix
5.3 Positioning in 4 DOF with augmented point features
5.3.1 Controller optimization
5.3.2 Simulation and experimental results
5.4 Positioning in simulations in 6 DOF with augmented point features
5.5 Alternative: Visual servoing on a virtual camera plane
5.6 Analysis and conclusion


6.1 Stability analysis depending on feature distribution
6.2 Optimal reference image selection
6.2.1 Control criteria
6.3 Navigation in the image space
6.4 Experimental results
6.4.1 Navigation across a sphere within the virtual reality
6.4.2 Navigation across a semi cylinder with a 5 DOF manipulator
6.4.3 Navigation across a cuboid with a 6 DOF manipulator
6.5 Alternative: Model-free pose estimation with local visual servoing
6.6 Evaluation and conclusion

7 Conclusions and future work

A Analysis of the grid-based time to contact from optical flow

B Analysis of the sensitivity matrix

Bibliography

Acknowledgements

Abbreviations

The abbreviations used within the scope of this work are listed below in alphabetical order.

ARIA Advanced Robot Interface for Applications
ARNL Advanced Robotics Navigation and Localization system
a.u. arbitrary units
AUTOSAR AUTomotive Open System ARchitecture
BRIEF Binary Robust Independent Elementary Features
CAD Computer-Aided Design
CMAES Controlled Model-Assisted Evolution Strategy
CV Current View
DBRVS Distance-Based Reference View Selection
DOF Degree Of Freedom
DoG Difference of Gaussian
EKF Extended Kalman Filter
FAST Features from Accelerated Segment Test
FCRVS Fixed Convergence Reference View Selection
FSI Fixed Scale Interpolation
GFTT Good Features To Track
GF-HOG Gradient Field-Histogram of Oriented Gradients
GLOH Gradient Location and Orientation Histogram
GV Goal View
HIL Hardware In the Loop
HOG Histogram of Oriented Gradients
IBVS Image-Based Visual Servoing
IR InfraRed
LQR Linear Quadratic Regulator
MAES Model-Assisted Evolution Strategy
NN Neural Network
ORB Oriented FAST and Rotated BRIEF
ORVS Optimal Reference View Selection
PBVS Position-Based Visual Servoing


PD Proportional Differential
PTZ Pan Tilt Zoom
RANSAC RANdom SAmple Consensus algorithm
RMSE Root Mean Square Error
ROS Robot Operating System
RV Reference View
SIFT Scale Invariant Feature Transformation
SII Scale Invariant Interpolation
SLAM Simultaneous Localization And Mapping
SNN Single Nearest Neighbor
SURF Speeded Up Robust Features
ToF Time of Flight
ttc time to contact
VSLAM Visual Simultaneous Localization And Mapping
WANN Weighted Average among three Nearest Neighbors

Nomenclature

In the present work, vectors and matrices are printed in bold type. Vectors are denoted by lower-case letters and matrices by capital letters, while scalars are set in italic type. The nomenclature is sorted as follows: Latin letters come before Greek letters, then lower-case before upper-case letters, and finally bold before italic type.

$\mathbf{a}$ control action (for appearance based visual servoing)
$a_h$ scaling factor (for homography)
$a_i$, $b_i$ distance of an interest point to its appropriate epipolar line corresponding to the $u$- and $v$-direction, respectively
$a_k$ pixel displacement
$a_m$, $b_m$, $c_m$, $d_m$ model parameters for exponential function
$\mathbf{A}$ Hessian matrix
$\alpha$ rotation around the $x$-axis (roll)
$\alpha_a$ correction factor for the adaptive image Jacobian
$\alpha_c$, $\dot{\alpha}_c$ camera pan angle and pan velocity, respectively
$\alpha_{ia}$, $\beta_{ia}$, $\gamma_{ia}$ interior angles
$\alpha_u$, $\alpha_v$ intrinsic camera parameters: scaling factors depending on $\lambda$ and pixel dimensions
$\mathbf{b}_{C_{ref}}$ image features in the reference frame
$\beta$ rotation around the $y$-axis (pitch)
$\beta_c$, $\dot{\beta}_c$ camera tilt angle and tilt velocity, respectively

$c$ performance criterion
$\mathit{conf}_{avg}$ mean of the confidence values
$\mathit{conf}_{seg(i,j)}$ confidence values in a window with the row and column position $(i, j)$ of the cell
$C$, $C_n$, $C_r$ absolute, normalized and relative number of feature correspondences between the reference view and the current image
$C_{ref}$, $C_{\alpha,\beta}$, $C_R$ static and rotated camera coordinate systems, respectively, and camera coordinate system in the image plane
$C_V$ virtual camera coordinate system, respectively virtual camera plane
$CV_i$ $i$-th reference view


$\mathbf{d}_{kp}$ normalized keypoint descriptor of SIFT features
$d$ distance
$D$ Difference-of-Gaussian
$\Delta\mathbf{f}$ error between desired and actual feature locations
$\Delta\hat{f}$ total normalized summed feature error
$\Delta f_\gamma$ correction along $\gamma$ of the averaged keypoint rotation
$\Delta\mathbf{f}_\omega$, $\Delta f_\omega$ predicted motion of the image features caused by $\Delta\Theta_R$
$\Delta\phi$ feature error between reference and current distortion (camera retreat problem)
$\Delta\Theta_R$ orientational task space error
$\Delta x$ lateral task space error
$\Delta z$ longitudinal task space error
$[e^1_a, e^2_a]^T$ epipoles from the actual image
$[e^1_{ref}, e^2_{ref}]^T$ epipoles from the desired view
$\mathbf{E}$ essential matrix describing the epipolar constraint
$\bar{E}(\theta)$, $\bar{E}(\varphi)$, $\bar{E}(r)$ mean absolute error in azimuth, elevation and radius
$E_u$, $E_v$ entropy along the $u$- and $v$-axis, respectively
$\varepsilon$ residual error between model and data point (for error function of the M-estimator)
$\varepsilon_d$ dissimilarity (residual error)
$\varepsilon_\gamma$ estimation error for camera rotation
$\eta_1$, $\eta_2$ tuning variables

$\mathbf{f}$ current image features, stated depending on the context as $\mathbf{f}_i = [u_i, v_i]$ for the $i$-th image feature with coordinates $u_i$, $v_i$; in the context of SIFT features as $\mathbf{f}_i = [u_i, v_i, \varphi_i, \sigma_i]$ with the additional attributes orientation $\varphi_i$ and scale $\sigma_i$; and in the context of image moments as $\mathbf{f} = [f_\alpha, f_\beta, f_\gamma, f_x, f_y, f_z]$
$\mathbf{f}_{ref}$ reference image features, also used in the context of image moments
$f_\alpha$ image moment for rotation around the $x$-axis
$f_\beta$ image moment for rotation around the $y$-axis
$f_\gamma$ image moment for rotation around the optical axis
$f_x$ image moment for translation along the $x$-axis
$f_y$ image moment for translation along the $y$-axis
$f_z$ image moment for translation along the camera axis
$f_{zd}$ image moment for translation along the camera axis, alternative expression via the distance between point features
$F$ cost function
$G$ Gaussian filter
$\gamma$ rotation around the $z$-axis (yaw), respectively the optical camera axis
$\gamma_t$ angle between the orientation of the virtual camera plane and the template plane
$\gamma_V$ angle between the virtual camera plane and the orientation of the robot
$h$ twice the distance between the parabola's vertex and the focus of an omnidirectional camera


$\mathbf{H}$, $\hat{\mathbf{H}}$ homography, estimated homography by feature correspondences
$H_u(i)$ relative frequency of features in the $i$-th column
$H_v(i)$ relative frequency of features in the $i$-th row
$I$ current image, also denoted as $I(u, v, t)$ in dependence of the pixel coordinates $u$, $v$ and time $t$
$I_{ref}$ reference image
$[I_u, I_v]^T$ spatial intensity gradient in $u$- and $v$-direction, respectively
$\mathbf{J}$ visual image Jacobian
$\mathbf{J}^+$ pseudoinverse of the image Jacobian
$\mathbf{J}_a$ Jacobian for appearance based visual servoing
$\mathbf{J}_e$ Jacobian for visual servoing on epipoles
$\mathbf{J}_{v\omega}$ separated Jacobian for rotational motion
$\mathbf{J}_{vt}$ separated Jacobian for translational motion
$\mathbf{J}_{v\xi u_\xi}$ separated Jacobian for angle and axis of rotation parametrization
$\mathbf{J}_{xz}$ separated Jacobian for translational motion, reduced to two degrees of freedom
$\mathbf{J}_{dk}$ robot Jacobian for differential kinematics
$\mathbf{J}_{f_i}$ image Jacobian for the image moment in $i$, where $i$ stands for $x$, $y$, $z$, $\alpha$, $\beta$, $\gamma$
$J_{f_i,j}$ image Jacobian entry for the image moment in $i$ with a movement in $j$, where both $i$ and $j$ stand for $x$, $y$, $z$, $\alpha$, $\beta$, $\gamma$ and $i = j$ (desired couplings)
$\tilde{J}_{f_i,j}$ image Jacobian entry for the image moment in $i$ with a movement in $j$, where both $i$ and $j$ stand for $x$, $y$, $z$, $\alpha$, $\beta$, $\gamma$ and $i \neq j$ (undesired couplings)
$\mathbf{J}_\omega$ separated Jacobian for rotational motion, reduced to one degree of freedom

$\mathbf{k}$ constant proportional gain
$\mathbf{k}_a$ adaptive gain
$k$ proportional gain factor
$\mathbf{K}$ camera calibration matrix as a function of the intrinsic camera parameters
$l_k$ image displacement
$L$ Gaussian-blurred image
$\lambda$ focal length
$\lambda_e$ evaluated individuals of $\lambda$-CMAES
$\lambda_{eig}$ eigenvalue
$\lambda_i$ Lagrange multiplier
$\lambda_p$ offspring of $\lambda$-CMAES
$\mu$ control parameter for Levenberg-Marquardt optimization
$\mu_{(i,j)}$ mean of the time to contact values in a segment with the row and column position $(i, j)$ of the cell
$\mu_p$ parents of $\lambda$-CMAES