End-to-End Driving Model for Steering Control of Autonomous Vehicles with Future Spatiotemporal Features

Tianhao Wu1, Ao Luo1, Rui Huang1, Member, IEEE, Hong Cheng1, Senior Member, IEEE, and Yang Zhao1

2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Macau, China, November 4-8, 2019.

Abstract: End-to-end deep learning has gained considerable interest in autonomous driving, in both academic and industrial fields, especially for the decision-making process. One critical issue in the decision-making process of autonomous vehicles is steering control. Researchers have already trained different artificial neural networks to predict the steering angle from a front-facing camera data stream. However, existing end-to-end methods only consider the spatiotemporal relation on a single layer and lack the ability to extract future spatiotemporal information. In this paper, we propose an end-to-end driving model based on a Convolutional Long Short-Term Memory (Conv-LSTM) neural network with a Multi-scale Spatiotemporal Integration (MSI) module, which aims to encode spatiotemporal information from different scales for steering angle prediction. Moreover, we employ future sequential information to enhance the spatiotemporal features of the end-to-end driving model. We demonstrate the efficiency of the proposed end-to-end driving model on the public Udacity dataset in comparison with some existing methods. Experimental results show that the proposed model performs better than other existing methods, especially in some complex scenarios. Furthermore, we evaluate the proposed driving model on a real-time autonomous vehicle, and the results show that the proposed driving model is able to predict the steering angle with high accuracy compared to a skilled human driver.

Index Terms: End-to-End Driving Model, Future Spatiotemporal Features, Multi-Scale Spatiotemporal Integration Module, Convolutional LSTM

I. INTRODUCTION

Autonomous driving techniques have gained considerable interest in both academic and industrial fields.

2) Future sequential information is employed in the training process of the driving model, which aims to enhance spatiotemporal features.

3) The proposed end-to-end driving model has been tested on both the public Udacity dataset and a real-time autonomous vehicle.

The proposed end-to-end driving model is validated on the public Udacity dataset in comparison with some existing methods. Furthermore, we also evaluate the proposed driving model on a real-time autonomous vehicle on our campus with a collected UESTC dataset. Experimental results show that the proposed end-to-end driving model performs better than existing end-to-end methods and achieves good steering angle prediction with high accuracy compared to a skilled human driver on the real-time autonomous vehicle.

Fig. 1. The architecture of the proposed end-to-end driving model.

The remainder of this paper is organized as follows. Section II introduces the proposed end-to-end driving model with details of the MSI module and the training process of future spatiotemporal features. In Section III, experimental results on both the public Udacity dataset and a real-time autonomous vehicle are presented and discussed. The paper ends with conclusions and future work in Section IV.

II. METHOD

This section presents the methodology details of the proposed end-to-end driving model. Section II-A lays down the architecture of the proposed end-to-end driving model, which combines an MSI module with Conv-LSTM. Then the training process of the proposed model is introduced in Section II-B, in which future sequential information is employed to enhance future spatiotemporal features.

A. Architecture of the End-to-End Driving Model

The architecture of the proposed end-to-end driving model is illustrated in Fig. 1. As shown in Fig. 1, the past n frames, from frame t-n+1 to frame t, are set as the inputs of the proposed driving model. The proposed driving model can be divided into two parts: the feature extracting network and the steering angle prediction network. The feature extracting network consists of an encoder (6 convolution layers) and an MSI module with Conv-LSTM.

As shown in Fig. 1, the proposed MSI module is composed of 4 spatiotemporal modules, and each module has 3 convolution layers and 1 convolutional LSTM. The spatiotemporal modules are successively added on top of the 3rd, 4th, 5th, and 6th layers of the encoder, respectively, which aims to encode the spatiotemporal information from different scales. In each spatiotemporal module, the first convolution layer is designed to filter the redundant spatial information of the input features, the convolutional LSTM is then employed to generate the temporal information of the past n frames, and the other two convolution layers aim to obtain key features for steering angle prediction. After each spatiotemporal module, we employ a fully connected layer to regulate the dimension and then merge the extracted spatiotemporal features from the different scales. With the obtained spatiotemporal features, fully connected layers are utilized to finally predict the steering angle at the current time step t. We also exploit the ground truths of time steps t+1 to t+k to guide the network during training; each of these predictors is composed of fully connected layers, but we do not use them to make predictions (more details are provided in the next subsection).

Table I gives the output sizes of all layers of the proposed end-to-end driving model. The size of the input images is 3 × 480 × 640. Corresponding to Fig. 1, we give a short name in Table I for every layer. For the encoder module, the 6 convolution layers are numbered from the left side to the right (Conv 1 to Conv 6). The four spatiotemporal branches of the MSI module are numbered from bottom to top (Scale 1 to Scale 4). After designing the architecture of the proposed end-to-end driving model, the training details for network optimization will be given in the next subsection.

TABLE I
OUTPUT SIZES OF ALL LAYERS OF THE PROPOSED END-TO-END DRIVING MODEL

Module       Name          Output size     Name          Output size
Encoder      Conv 1        24 × 238 × 318  Conv 2        36 × 117 × 157
             Conv 3        48 × 57 × 77    Conv 4        64 × 28 × 38
             Conv 5        76 × 13 × 18    Conv 6        98 × 6 × 8
MSI Module   Conv 1-1      24 × 57 × 77    Conv 2-1      32 × 28 × 38
             Conv-LSTM 1   24 × 57 × 77    Conv-LSTM 2   32 × 28 × 38
             Conv 1-2      24 × 28 × 38    Conv 2-2      32 × 13 × 18
             Conv 1-3      12 × 28 × 38    Conv 2-3      16 × 13 × 18
             FC 1          3744            FC 2          912
             Conv 3-1      38 × 13 × 18    Conv 4-1      49 × 6 × 8
             Conv-LSTM 3   38 × 13 × 18    Conv-LSTM 4   49 × 6 × 8
             Conv 3-2      38 × 6 × 8      Conv 4-2      49 × 2 × 3
             Conv 3-3      19 × 6 × 8      Conv 4-3      24 × 2 × 3
             FC 3          144             FC 4          64
Predictor    FC-1          16              FC-2          1
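To make one branch of the MSI module concrete, the following PyTorch sketch (the paper's experiments are implemented in PyTorch) shows a minimal convolutional LSTM cell and one spatiotemporal branch: a convolution that filters redundant spatial information, a Conv-LSTM run over the past n frames, two further convolutions, and a fully connected layer that regulates the output dimension. This is an illustrative reconstruction based only on the description above, not the authors' code; the class names, kernel sizes, strides, and activation choices are our assumptions and are not specified in the paper.

```python
import torch
import torch.nn as nn


class ConvLSTMCell(nn.Module):
    """Minimal convolutional LSTM cell: all four gates are computed by one
    convolution over the concatenated input and hidden state."""

    def __init__(self, in_ch, hid_ch, kernel_size=3):
        super().__init__()
        self.hid_ch = hid_ch
        self.gates = nn.Conv2d(in_ch + hid_ch, 4 * hid_ch,
                               kernel_size, padding=kernel_size // 2)

    def forward(self, x, state):
        h, c = state
        i, f, o, g = torch.chunk(self.gates(torch.cat([x, h], dim=1)), 4, dim=1)
        c = torch.sigmoid(f) * c + torch.sigmoid(i) * torch.tanh(g)
        h = torch.sigmoid(o) * torch.tanh(c)
        return h, c

    def init_state(self, batch, height, width, device):
        zeros = torch.zeros(batch, self.hid_ch, height, width, device=device)
        return zeros, zeros.clone()


class SpatiotemporalBranch(nn.Module):
    """One branch of the MSI module: spatial filtering, a Conv-LSTM over the
    past n frames, two refining convolutions, and a dimension-regulating FC."""

    def __init__(self, in_ch, hid_ch, out_dim, feat_hw):
        super().__init__()
        self.reduce = nn.Conv2d(in_ch, hid_ch, 3, padding=1)   # filter redundant spatial info
        self.convlstm = ConvLSTMCell(hid_ch, hid_ch)           # temporal information
        self.refine = nn.Sequential(                           # key features for prediction
            nn.Conv2d(hid_ch, hid_ch, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(hid_ch, hid_ch // 2, 3, padding=1), nn.ReLU(inplace=True))
        h, w = feat_hw
        self.fc = nn.Linear((hid_ch // 2) * ((h + 1) // 2) * ((w + 1) // 2), out_dim)

    def forward(self, feats):          # feats: (batch, n, C, H, W) encoder features
        b, n, _, h, w = feats.shape
        state = self.convlstm.init_state(b, h, w, feats.device)
        for t in range(n):             # run the Conv-LSTM across the past n frames
            state = self.convlstm(self.reduce(feats[:, t]), state)
        out = self.refine(state[0])    # use the last hidden state
        return self.fc(out.flatten(1))


# Illustrative instantiation for the first scale: 48-channel encoder features
# (Conv 3 of the encoder) regulated to a 3744-dimensional vector as in Table I.
# branch = SpatiotemporalBranch(in_ch=48, hid_ch=24, out_dim=3744, feat_hw=(57, 77))
```

In the full MSI module, one such branch would be attached to each of the encoder's 3rd to 6th convolution layers, and the four regulated feature vectors would be merged before the fully connected steering angle predictor.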
B. Spatiotemporal Feature Enhancement with Future Sequential Information

With the designed end-to-end driving model with the MSI module, future sequential information is utilized to enhance the spatiotemporal features during the training process. Fig. 2 shows the training process of the driving model with future sequential information. As depicted in Fig. 2, the past n frames of camera images, from frame t-n+1 to the current frame t, are set as the input of the driving model.

Fig. 2. The training process of the end-to-end driving model.

In order to enhance the spatiotemporal features of the proposed driving model, the ground truths of the steering angle from the current frame t to the k-th frame after the current frame are employed as auxiliary supervision. The final cost function is designed as follows:

J = Loss_t + \sum_{i=1}^{k} \lambda_i \, Loss_{t+i}    (1)

where Loss_t and Loss_{t+i} denote the loss of the angle prediction at time steps t and t+i, respectively, and \lambda_i (i = 1, ..., k) are the weight parameters of the losses at the different time steps. In this paper, we adopt a simple form of squared loss as follows:

Loss_t = \frac{1}{N} \sum_{n=1}^{N} \left\| \hat{s}_{t,n} - s_{t,n} \right\|_2^2    (2)

where N indicates the number of samples used for each model update, \hat{s}_{t,n} denotes the learned model's prediction at time t for sample n, and s_{t,n} is the ground truth of the steering angle. With the cost function described in Eqn. (1), the model can be trained through back-propagation. Note that the ground truths of frames t+1 to t+k are auxiliary labels for training; after training, the corresponding predictions are not used in the testing procedure. Once trained, the end-to-end driving model generates a single steering angle (i.e., the steering angle at time step t) from the camera images of every past n frames. Fig. 3 shows this configuration.

Fig. 3. The trained end-to-end driving model is utilized to generate a single steering angle from the camera images of every past n frames.
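A minimal sketch of how the cost function in Eqns. (1)-(2) could be implemented during training is given below, assuming the model returns the current-frame prediction together with the k auxiliary future-frame predictions. The function and tensor names, and the example choice of equal weights \lambda_i, are illustrative assumptions rather than values from the paper.

```python
import torch
import torch.nn.functional as F

def future_enhanced_loss(pred_t, aux_preds, target_t, future_targets, weights):
    """Eqns. (1)-(2): squared loss at the current step plus weighted squared
    losses for the k auxiliary future-frame predictors (training only).

    pred_t:         (N,) predicted steering angle at time step t
    aux_preds:      list of k tensors (N,), predictions for t+1 ... t+k
    target_t:       (N,) ground-truth steering angle at time step t
    future_targets: list of k tensors (N,), ground truths for t+1 ... t+k
    weights:        list of k floats, the lambda_i in Eqn. (1)
    """
    loss = F.mse_loss(pred_t, target_t)                  # Loss_t, Eqn. (2)
    for w, p, y in zip(weights, aux_preds, future_targets):
        loss = loss + w * F.mse_loss(p, y)               # lambda_i * Loss_{t+i}
    return loss

# Illustrative usage with k = 4 auxiliary heads and equal weights (assumption):
# total = future_enhanced_loss(pred_t, aux_preds, target_t, future_targets,
#                              weights=[1.0, 1.0, 1.0, 1.0])
# total.backward()
```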
III. EXPERIMENTS

In this section, we first validate the proposed end-to-end driving model on the public Udacity dataset in comparison with some existing end-to-end methods. Then we also validate the proposed driving model on a real-time autonomous vehicle on our campus with a collected UESTC dataset. The next two subsections present the experimental results and discussions in detail.

A. Experiments on the Udacity Dataset

1) The Udacity Dataset: The Udacity dataset was originally provided for a series of self-driving challenges [13]. In this paper, we adopt a subset of the Udacity dataset (Udacity Challenge II) for experimental purposes. The Udacity Challenge II dataset contains a total of 33,808 frames for model training and 5,614 frames for model testing, in which vehicle speed, torque, steering angle, and video streams from three front-view cameras are recorded. The resolution of images in Udacity is 480 × 640.

2) Experiment Details: The experiments are implemented on a workstation with 4 GeForce GTX Titan GPUs. All code is written and implemented under the PyTorch framework. We randomly sample 15% of the training data for validating models and always choose the best model on this validation set. The number of past frames n is chosen as 10. ADAM is utilized as the optimizer. The initial learning rate is set to 1 × 10^-4 for the experiments on the Udacity dataset. In order to elevate the generalization ability of end-to-end driving models, we employ a widely adopted data augmentation scheme on the Udacity dataset by mirroring images [12].
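The mirroring augmentation mentioned above is commonly implemented by flipping each frame horizontally and negating the steering label so that left and right turns remain consistent; the short sketch below shows this standard practice. The negation of the label is our assumption of the usual convention and is not spelled out in the paper.

```python
import torch

def mirror_augment(frames, steering):
    """Horizontally flip a clip of frames and negate the steering angle.

    frames:   (n, C, H, W) tensor holding the past n camera frames
    steering: scalar tensor, ground-truth steering angle in rad
    """
    flipped = torch.flip(frames, dims=[-1])   # flip along the width axis
    return flipped, -steering

# Illustrative usage: enlarge the training set by adding mirrored samples.
# aug_frames, aug_angle = mirror_augment(frames, angle)
```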
3) Experimental Results and Analysis: In the experiments on the public Udacity dataset, we first evaluate the proposed end-to-end driving model with different amounts of future sequential information. In this experiment, the proposed end-to-end model without future sequential information is set as the baseline of our model, called MSINet in the experiments. We evaluate both the MSINet and the MSINet with future t+k frames. For the models with future frames, we set the steering angle prediction of future frames as side tasks of our model during the training process. Considering the computational cost of model training, we set the number of future frames to 1 ≤ k ≤ 5. Table II compares the MSINet and the MSINet with future t+k frames. In this paper, the Root Mean Square Error (RMSE) is utilized as the evaluation index of the driving models.

TABLE II
EXPERIMENTAL RESULTS OF THE PROPOSED DRIVING MODEL ON THE UDACITY CHALLENGE II DATASET

Model               RMSE (rad)
MSINet              0.0613
with t+1 frame      0.0574
with t+2 frames     0.0545
with t+3 frames     0.0519
with t+4 frames     0.0491
with t+5 frames     0.0504

As shown in Table II, the experimental results indicate that the models involving future sequential information achieve better steering angle prediction than the MSINet. From the results we can see that the MSINet with t+4 frames achieves the best steering angle prediction among all models on the Udacity Challenge II dataset (RMSE is 0.0491 rad).

Fig. 4. Comparison of the MSINet and the MSINet with t+4 frames in three curve road situations.

To further analyze the proposed driving model with future spatiotemporal features, which are enhanced by future sequential information, we examine the performance of our models in different road scenarios. We found that the proposed driving model with future spatiotemporal features gives better steering prediction in curve road situations. Fig. 4 shows the comparison of the MSINet and the MSINet with t+4 frames in three curve road situations.

Fig. 5. Comparison of steering angle prediction between the MSINet and the MSINet with t+4 frames when passing curve roads: (a), (b), (c).

As depicted in Fig. 4, the MSINet with t+k frames achieves better steering angle prediction than the MSINet in curve road situations. We also extract the results of steering angle prediction when passing curve roads. Fig. 5 illustrates the comparison of steering angle prediction between the MSINet and the MSINet with t+4 frames when passing curve roads. The three figures in Fig. 5 (a), (b), (c) correspond to the three curve road situations of Fig. 4, from top to bottom. The results depicted in Fig. 5 show that the MSINet with future spatiotemporal features (t+4 frames) performs better when passing curve roads.

In the experiments on the Udacity dataset, we also compare our model with some existing end-to-end methods. We first reproduced CgNet, NVIDIA's PilotNet, and the ST-LSTM network, and then trained these models on the training set of the Udacity Challenge II dataset. Brief introductions of these methods are given as follows:

1) CgNet: The CgNet is published as a baseline for Udacity Challenge II [14] and is composed of 3 convolution layers and a fully connected layer.
2) PilotNet: The PilotNet is proposed by NVIDIA [8] and is composed of 5 convolution layers and 4 fully connected layers.
3) ST-LSTM network: The ST-LSTM network is proposed by Chi et al. [12] and combines spatiotemporal convolution layers with LSTM.

Table III gives the comparison of steering angle prediction between our models and the other end-to-end driving models. The experimental results in Table III show that our models perform better than the other end-to-end driving models on the public Udacity Challenge II dataset.

TABLE III
COMPARISON OF STEERING ANGLE PREDICTION BETWEEN OUR MODEL AND OTHER END-TO-END DRIVING MODELS

Model                     RMSE (rad)
CgNet                     0.1779
NVIDIA's PilotNet         0.1589
ST-LSTM Network           0.0622
MSINet                    0.0631
MSINet with t+4 frames    0.0491
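For reference, the RMSE metric used throughout this section follows the standard definition and can be computed as sketched below; the function and array names are chosen for illustration only.

```python
import torch

def steering_rmse(predictions, ground_truth):
    """Root Mean Square Error (in rad) between predicted and recorded angles.

    predictions:  (N,) tensor of predicted steering angles
    ground_truth: (N,) tensor of recorded (human-driver) steering angles
    """
    return torch.sqrt(torch.mean((predictions - ground_truth) ** 2))

# Illustrative usage:
# rmse = steering_rmse(model_outputs, recorded_angles)   # e.g. 0.0491 rad
```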
B. Experiments on a Real-time Autonomous Vehicle

In this subsection, the proposed end-to-end driving model is evaluated on a real-time autonomous vehicle within our campus. First, we give a brief introduction of the dataset collected on our campus, called the UESTC dataset. Then the driving system of the real-time autonomous vehicle is introduced, together with some experimental details of the model training process. Results and discussions are given at the end of this subsection.

1) The UESTC Dataset: In order to train our model for road testing on our campus, we built a driving dataset called the UESTC dataset, with a total of 30,878 frames of images and steering angles. The resolution of each image is 1280 × 1024, with a sampling rate of 15 frames per second. Fig. 6 shows some example video frames of the collected UESTC dataset. With the collected UESTC dataset, the training process of the driving model is set the same as on the Udacity dataset. Input images for the training process are resized to 3 × 480 × 640.

Fig. 6. Example video frames in the UESTC dataset.

2) The Driving System of the Autonomous Vehicle: The driving system of our autonomous vehicle is built based on the Robot Operating System (ROS). Fig. 7 shows the autonomous vehicle and the block diagram of our driving system. As shown in Fig. 7(b), two ROS nodes (green blocks) are implemented in our driving system, where the driving model is embedded in the steering angle prediction node. The input of the prediction node is the image stream from the front-view camera, and its output is the predicted steering angle. Another CAN analysis node is designed to convert the predicted steering angles into control messages. The driving system is implemented on an on-board computing platform with a single GeForce GTX Titan GPU, and it is built on an FAW A70E electric car.

Fig. 7. The autonomous vehicle with the driving system. (a) The autonomous driving system. (b) Block diagram of the driving system.

3) Results and Discussions: In the experiment on the real-time autonomous vehicle, we choose the MSINet with t+4 frames for model training and on-road testing. For safety considerations, the whole experiment lasted about 50 minutes, and the vehicle drove about 4.2 km on our campus at an average speed of 5 km/h. As in the experiments on Udacity, we calculate the RMSE between the steering angle predicted by our model and that of the human driver, with a result of 0.0544 rad. Fig. 8 shows a fragment of the whole experiment, comparing our model's prediction with the human driver. As shown in Fig. 8, our model is able to achieve good predictions compared with the human driver.

Fig. 8. A fragment of the on-road experiment: comparison of our model's steering angle prediction with the human driver.

IV. CONCLUSIONS AND FUTURE WORK

This paper has proposed a novel end-to-end driving model for steering angle prediction of autonomous vehicles. The proposed driving model is designed based on the Conv-LSTM neural network combined with an MSI module for encoding spatiotemporal information on multiple layers. In order to enhance the spatiotemporal features of the driving model for better steering angle prediction, we employ future sequential information in the model training process. The performance of the proposed driving model has been validated on both the public Udacity dataset and a real-time autonomous vehicle. Experimental results show that our model performs better than other existing methods on the public Udacity dataset and achieves good steering angle prediction in real-time autonomous vehicle testing.

In the future, we will improve our model for smooth steering control, aiming to improve vehicle stability at higher speeds. Moreover, visualization of the proposed model will be considered in future work to improve the performance of the driving model.

ACKNOWLEDGMENT

This work was made possible by support from the National Key Research and Development Program of China (2017YFB1302300, 2017YFB0102603), the National Natural Science Foundation of China (NSFC No. 6150020696, 61503060), and the Fundamental Research Funds for the Central Universities (No. ZYGX2015J148).

REFERENCES

[1] M. Montemerlo, J. Becker, S. Bhat, et al., "Junior: The Stanford Entry in the Urban Challenge," Journal of Field Robotics.
