Semantic segmentation task for the ADE20K & Cityscapes datasets, based on several models.

hellochick/semantic-segmentation-tensorflow

semantic-segmentation-tensorflow

This is a TensorFlow implementation of semantic segmentation models on the MIT ADE20K scene parsing dataset and the Cityscapes dataset. We reproduce the inference phase of several models, including PSPNet, FCN, and ICNet, by converting the released pre-trained weights into TensorFlow format and applying them to hand-built models. We also refer to the ENet implementation from the freg856 GitHub repository. Task integration is still in progress.

Models

  1. PSPNet
  2. FCN
  3. ENet
  4. ICNet

...to be continued

Install

Get the corresponding converted pre-trained weights and put them into the model directory:

FCN PSPNet ICNet
Google drive Google drive Google drive

Inference

Run the following command:

python inference.py --img-path /Path/To/Image --model Model_Type

Arg list

--model - the model to run; choose from "icnet", "pspnet", "fcn", or "enet"
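As an illustration of how the two options mentioned in this README could be parsed, here is a minimal sketch using Python's standard `argparse`. It is not taken from `inference.py`, whose actual argument handling may differ; `build_parser` and `MODEL_CHOICES` are hypothetical names introduced only for this example.

```python
import argparse

# Model names taken from the argument list above.
MODEL_CHOICES = ["icnet", "pspnet", "fcn", "enet"]


def build_parser():
    """Build a parser for the two options this README mentions (hypothetical helper)."""
    parser = argparse.ArgumentParser(description="Run semantic segmentation inference.")
    parser.add_argument("--img-path", type=str, required=True,
                        help="Path to the input image.")
    parser.add_argument("--model", type=str, required=True, choices=MODEL_CHOICES,
                        help="Which network to run.")
    return parser


# Parse an explicit argument list so the sketch runs without a real command line.
args = build_parser().parse_args(["--img-path", "/Path/To/Image", "--model", "pspnet"])
print(args.img_path, args.model)
```

Restricting `--model` with `choices` makes a misspelled model name fail at parse time rather than later during weight loading.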

Import the module in your code:

import tensorflow as tf  # TensorFlow 1.x API (tf.Session)

from model import FCN8s, PSPNet50, ICNet, ENet

model = PSPNet50()  # or another model

# img_path and model_path are paths you supply
model.read_input(img_path)  # read image data from path

config = tf.ConfigProto()  # default session configuration
sess = tf.Session(config=config)
init = tf.global_variables_initializer()
sess.run(init)

model.load(model_path, sess)  # load pre-trained weights
preds = model.forward(sess)  # get the prediction
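To view a prediction, a label map is usually turned into a color image. The sketch below shows one way to do that with NumPy. It assumes the prediction has already been reduced to a 2-D array of integer class indices; this README does not document what `model.forward` returns, so that shape is an assumption. `colorize` and its randomly generated palette are hypothetical and are not the color scheme used by this repository.

```python
import numpy as np


def colorize(label_map, num_classes):
    """Map an (H, W) array of class indices to an (H, W, 3) uint8 color image."""
    rng = np.random.RandomState(0)  # fixed seed so colors are reproducible
    palette = rng.randint(0, 256, size=(num_classes, 3)).astype(np.uint8)
    # Fancy indexing looks up one RGB row of the palette per pixel.
    return palette[label_map]


# Toy 2x3 label map standing in for a real prediction.
labels = np.array([[0, 1, 1], [2, 2, 0]])
color_image = colorize(labels, num_classes=3)
print(color_image.shape)  # (2, 3, 3)
```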

Results

ade20k

Input Image PSPNet FCN

cityscapes

Input Image ICNet ENet

Citation

@inproceedings{zhao2017pspnet,
  author = {Hengshuang Zhao and
            Jianping Shi and
            Xiaojuan Qi and
            Xiaogang Wang and
            Jiaya Jia},
  title = {Pyramid Scene Parsing Network},
  booktitle = {Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  year = {2017}
}

Scene Parsing through ADE20K Dataset. B. Zhou, H. Zhao, X. Puig, S. Fidler, A. Barriuso and A. Torralba. Computer Vision and Pattern Recognition (CVPR), 2017. (http://people.csail.mit.edu/bzhou/publication/scene-parse-camera-ready.pdf)

@inproceedings{zhou2017scene,
    title={Scene Parsing through ADE20K Dataset},
    author={Zhou, Bolei and Zhao, Hang and Puig, Xavier and Fidler, Sanja and Barriuso, Adela and Torralba, Antonio},
    booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
    year={2017}
}

Semantic Understanding of Scenes through ADE20K Dataset. B. Zhou, H. Zhao, X. Puig, S. Fidler, A. Barriuso and A. Torralba. arXiv:1608.05442. (https://arxiv.org/pdf/1608.05442.pdf)

@article{zhou2016semantic,
  title={Semantic understanding of scenes through the ade20k dataset},
  author={Zhou, Bolei and Zhao, Hang and Puig, Xavier and Fidler, Sanja and Barriuso, Adela and Torralba, Antonio},
  journal={arXiv preprint arXiv:1608.05442},
  year={2016}
}