2018 06 06
- Fix dependencies for test_paddle_inference_api_impl and build error when WITH_TESTING is OFF
- Fix CI build for paddlepaddle/PARL repo
- Fix documents:
- Point go package installation path into build directory
- Fix teamcity build to skip CI when only changing documents files
- face detection
- Add object_coverage constraint and fix label bug
- [WIP] Verification of face detection accuracy
- code review
- Add infer scripts: https://github.com/PaddlePaddle/models/pull/966
- Add head bbox for Pyramid-Box: https://github.com/PaddlePaddle/models/pull/963
- inference engine:
- add build and install document of fluid inference library: https://github.com/PaddlePaddle/Paddle/pull/11090
- rewrite unittest of trt_activation_op: https://github.com/PaddlePaddle/Paddle/pull/11222
- mkldnn:
- add ParallelDo CPU multi-thread training example for benchmark/fluid, fix test and flower dataset error and refine the codes:
- OCR CPU Inference:
- The newest MKLML library hangs with the old static MKL library: Intel @[email protected] reproduced the issue and reported it to the MKL internal team.
- provide cuda8+cudnn5+MKL_static libpaddle_fluid.so
- code review:
- MKLDNN:
- Mkldnn layout:https://github.com/PaddlePaddle/Paddle/pull/11040
- rename Mkldnn to MKLDNN: https://github.com/PaddlePaddle/Paddle/pull/11147
- MKLDNN:
-
PR
- [Merged] text_classification infer performance test https://github.com/PaddlePaddle/Paddle/pull/11080
- [Merged] mkldnn name https://github.com/PaddlePaddle/Paddle/pull/11147
- [Merged] enable infer api with multi-threads https://github.com/PaddlePaddle/Paddle/pull/11162
- [Merged] fix abort multi-threads https://github.com/PaddlePaddle/Paddle/pull/11233
- [WIP] Infer multi-threads API Demo and UT https://github.com/PaddlePaddle/Paddle/pull/11247
-
review code
- mkldnn layout https://github.com/PaddlePaddle/Paddle/pull/11040
- scope clean up https://github.com/PaddlePaddle/Paddle/pull/11243
- add python-opencv in latest https://github.com/PaddlePaddle/Paddle/pull/11242
-
issue
- [fixed] large memory when infer issue https://github.com/PaddlePaddle/Paddle/issues/11185
- [WIP] abort in multi-threads inference on CPU https://github.com/PaddlePaddle/Paddle/issues/11231
-
NLP performance report against online text_classification http://agroup.baidu.com/paddlepaddle/md/article/917068. We get about a 1.35x speedup.
-
mkldnn feedback http://agroup.baidu.com/paddlepaddle/md/edit/942676.
- We are still trying to find a way to use mkl_sequential.
- They do not have any plans for sparse matrix support yet.
- RNN is fairly done with gemm.
-
Fixed and optimized Checkpoint
-
Checkpoint On PaddleCloud
-
the init parameter "optimizer" of Trainer() should be a function
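A minimal, framework-free sketch of the note above, assuming the point is that the trainer receives a callable and builds the optimizer itself. `Trainer` and `SGD` here are hypothetical stand-ins, not Paddle's classes, and the keyword name `optimizer_func` is used only for illustration.

```python
class SGD:
    """Hypothetical stand-in for an optimizer (not Paddle's class)."""

    def __init__(self, learning_rate):
        self.learning_rate = learning_rate


class Trainer:
    """Hypothetical stand-in that accepts an optimizer factory."""

    def __init__(self, optimizer_func):
        # Reject an already-built optimizer: the trainer wants to decide
        # when (and in which context) the optimizer gets constructed.
        if not callable(optimizer_func):
            raise TypeError("optimizer_func must be a function returning an optimizer")
        self.optimizer = optimizer_func()


def optimizer_func():
    # The factory is called by the trainer, not by user code.
    return SGD(learning_rate=0.01)


trainer = Trainer(optimizer_func=optimizer_func)
```

Passing `SGD(learning_rate=0.01)` directly raises `TypeError` in this sketch, which mirrors the intent of the note.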
-
Definition of next steps
-
Code Review:
- NMT:
- Fix and enhance beam_search_op and beam_search_decode_op.
- Experiments on WMT14 en-de dataset.
- Compare with Tensor2Tensor and tune the model with new features (BPE data and weight sharing)
-
paddle fluid framework
- fix protobuf memory leak https://github.com/PaddlePaddle/Paddle/pull/11177
- add host_memory_profiling_cn.md https://github.com/PaddlePaddle/Paddle/pull/11191 https://github.com/PaddlePaddle/Paddle/pull/11208 https://github.com/PaddlePaddle/Paddle/pull/11212
- "change eigen mirror" https://github.com/PaddlePaddle/Paddle/pull/11240
- fix build error on mac https://github.com/PaddlePaddle/Paddle/pull/11134
- fix transpiler package https://github.com/PaddlePaddle/Paddle/pull/11087
- Fix compile error on mac caused by std move https://github.com/PaddlePaddle/Paddle/pull/11034
-
distributed training
- get data from wangsijiang@feed, start to implement deep & wide model.
-
AbacusToPaddle
- fix all compile problems; the two systems can now work together.
-
fengchao reinforcement learning with Paddle
- Discuss how to use VDL to debug their rl model
- discuss the usage of reshape op
- aws integration with CE merged
- NCCL2 support in progress
- team city restore
- Refine RecordIO reader with Yuyang
- Data preprocessing related job:
- image resize API: https://github.com/PaddlePaddle/Paddle/pull/11198
- reverse op: https://github.com/PaddlePaddle/Paddle/pull/11223
- [WIP] center crop: https://github.com/PaddlePaddle/Paddle/pull/11245
- bug fix:
- performance
- overlap memcpy in rpc op; an experiment on the VGG model shows a 15% performance improvement, https://github.com/PaddlePaddle/Paddle/pull/11221
- stability
- dist mnist unit test, https://github.com/PaddlePaddle/Paddle/pull/11189
- speed up test_listen_and_serv, https://github.com/PaddlePaddle/Paddle/pull/11126
- PR review
- checkpoint API, https://github.com/PaddlePaddle/Paddle/pull/10878#pullrequestreview-125431510
- support record in benchmark, https://github.com/PaddlePaddle/Paddle/pull/11121#pullrequestreview-125452035
- Face Detection:
- Add head bbox for Pyramid-Box model. https://github.com/PaddlePaddle/models/pull/963
- Add infer scripts. https://github.com/PaddlePaddle/models/pull/966
- Debug the training.
- Others:
- API: https://github.com/PaddlePaddle/Paddle/issues/11246
- code review:
- box_coder_op:
- slice_op: https://github.com/PaddlePaddle/Paddle/pull/11052#pullrequestreview-125821114
- Prune dims supported by reduce op: https://github.com/PaddlePaddle/Paddle/pull/11113
- Clean up codes:
- Add rpc_client interface: https://github.com/PaddlePaddle/Paddle/pull/11154
- Add comment of grpc.tar.xz: https://github.com/PaddlePaddle/Paddle/pull/11153
- Fix benchmark/readme bug: https://github.com/PaddlePaddle/Paddle/pull/10960
- Move sync_mode device ctx from grpc server: https://github.com/PaddlePaddle/Paddle/pull/10881
- Add brpc support
- Modify Pybind LoDTensor API according to length-based LoD:
- Modify lod tensor doc based on new LoDTensor Python API:
- Fix LodTensor API in memory opt machine translation example:
- Review
- https://github.com/PaddlePaddle/Paddle/pull/11170#pullrequestreview-125777639
- https://github.com/PaddlePaddle/Paddle/pull/11169#pullrequestreview-125759055
- https://github.com/PaddlePaddle/Paddle/pull/11167#pullrequestreview-125717589
- https://github.com/PaddlePaddle/Paddle/pull/11166#pullrequestreview-125749672
- https://github.com/PaddlePaddle/Paddle/pull/11130#pullrequestreview-125335601
- Tape Prototype: https://github.com/PaddlePaddle/Paddle/pull/11019
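For the length-based LoD items above, a small sketch of how the two representations relate, assuming the usual convention that an offset-based level such as `[0, 2, 5]` describes the same sequences as the length-based level `[2, 3]`. The helper names are hypothetical and are not part of the Pybind API.

```python
from itertools import accumulate


def lengths_to_offsets(length_lod):
    """Convert each level from sequence lengths to cumulative offsets."""
    return [[0] + list(accumulate(level)) for level in length_lod]


def offsets_to_lengths(offset_lod):
    """Convert each level from cumulative offsets back to sequence lengths."""
    return [[end - start for start, end in zip(level, level[1:])]
            for level in offset_lod]


# Two sequences of length 2 and 3 at a single level.
offsets = lengths_to_offsets([[2, 3]])
lengths = offsets_to_lengths(offsets)
```

The length form is easier to write by hand, while the offset form gives direct slice boundaries into the underlying tensor.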
- Inference
-
high level API
- merged, simplify inference api
- merged, inference API little fix
- merged, feature/simple inference demo
- merged, Feature/anakin embed
-
sub-graph related
- open, feature/trt engine op test
- merged, Feature/fc converter
- merged, feature/tensorrt engine op
- merged, add dfg graphviz pass
-
enhancement && bugfix
- WIP, add replace enforce with glog switch for debug
- merged, fix manylinux compile error caused by inference lib
- merged, clean docstring_checker.pyc
- merged, fix compile error
-
https://github.com/PaddlePaddle/Paddle/pull/11090#pullrequestreview-125044331 https://github.com/PaddlePaddle/continuous_evaluation/pull/59#pullrequestreview-125044827 https://github.com/PaddlePaddle/continuous_evaluation/pull/64#pullrequestreview-126628433
- Train ImageNet On 64 GPUs with LARS, data prepare (preprocess op?)
- fluid_benchmark update: https://github.com/PaddlePaddle/Paddle/pull/11121
- Trainer send complete signal: https://github.com/PaddlePaddle/Paddle/pull/11220
- Refine RPC client sync wait: https://github.com/PaddlePaddle/Paddle/pull/11132
- small fixes and reviews
- TODO: New EDL design doc
-
PR
- [WIP]SE-ResNeXt-152 multi card acceleration ratio tuning process
- Fuse AllReduce Operator
- [Feature] Add fuse vars op handle
- Refine fluid_benchmark.py
- Balance parameter opt
- Drop the last batch, if the size of last batch is not equal to batch_size.
- Add resnet 50
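The "drop the last batch" item above can be sketched in plain Python. This is an illustrative helper with a hypothetical name (`batch_items`), not the code from the PR; it only shows the behaviour of discarding a trailing batch smaller than `batch_size`.

```python
def batch_items(items, batch_size, drop_last=True):
    """Yield lists of `batch_size` items; optionally drop a short final batch."""
    buf = []
    for item in items:
        buf.append(item)
        if len(buf) == batch_size:
            yield buf
            buf = []
    # A non-empty buffer here means the last batch is smaller than batch_size.
    if buf and not drop_last:
        yield buf


full_only = list(batch_items(range(7), batch_size=3))
with_tail = list(batch_items(range(7), batch_size=3, drop_last=False))
```

Dropping the short batch keeps every step the same shape, which matters when the batch is split evenly across several cards.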
-
Review
- Fluid benchmark support recordio reader
- SSA Graph Builder Factory
- Add image_resize_short and refine resize API
- Pull requests
- issues
- [WIP]Add argmin and argmax ops
-
DeepASR: 1) Acoustic model training; 2) Adapt decoder to the new net config
-
Transformer: model training (with @guosheng)
-
Detection: Add Argsort Op
-
ONNX converter: Merge compare ops & several relu ops
Code Review:
- Speed up read RecordIO
- Refactor ParallelExecutor to factory
- Fuse AllReduceOp
- Release 0.13.0
- Debug memory leak
- Debug distributed train hang
- fluid_benchmark.py
- Scope clean up
-
CE frame
- support run modified models of CE
- CE web problem
- CE document (CE onduty and Hi alarm)
-
Teamcity CI server & db fault tolerance
-
CE resnet: add multi-card speedup testing on P40 (in progress)
-
paddle code scan (C++ and Python)
- memory optimize
- static ssa graph convert and optimize
- model non-deterministic/reproducible
- check op in http://agroup.baidu.com/paddlepaddle/view/office/946408
- compatible with cudnn5
- fix cudnn non-determinism
- try to fix non-determinism issue with @Goyal
- Paddle AMD device support
- accelerate amd device support
- solve the save algorithm issue in conv/conv_grad
- Adapt ModelAverage to latest high level api.
- Prune dims supported by reduce op.
- Rewrite OCR CTC model by latest high level api.[WIP]
-
nmt
- running transformer on paddle cloud with multi-machines, with some problems:
- stability: core dumps when the trainer num is large, for example 8 or 16
- precision: avg loss is 3.5+; the best is 2.5 when trained with multi-cards on one machine
- old version: only one card, multi-machines
- move newest version of transformer to paddle cloud
- one bug to be fixed: the GPU release version of paddle won't assert when an embedding index is out of range
-
abacus2paddle
- paddle cloud can provide 100+ machines for CTR training tests
-
High level API:
- PR: Modify optimizer in new API: https://github.com/PaddlePaddle/Paddle/pull/11168
- PR: Fix optimizer: https://github.com/PaddlePaddle/Paddle/pull/11172
- PR: Label semantic roles book example: https://github.com/PaddlePaddle/book/pull/540
- PR: sentiment analysis book example: https://github.com/PaddlePaddle/book/pull/539
- Review: Recommendation system book example: https://github.com/PaddlePaddle/Paddle/pull/11252
- Review: Compare results for data results: https://github.com/PaddlePaddle/book/pull/536
- Review: Update MNIST book with new optimizer: https://github.com/PaddlePaddle/book/pull/535
- Review: recommender system example: https://github.com/PaddlePaddle/book/pull/526
- Review: Image classification book example: https://github.com/PaddlePaddle/book/pull/533
- Review: LoDTensor API change: https://github.com/PaddlePaddle/Paddle/pull/11171
- Review: Recognize digits example: https://github.com/PaddlePaddle/book/pull/529
-
Fix non-determinism in Paddle CUDA kernels:
- PR: Sparse sgd without atomicAdd: https://github.com/PaddlePaddle/Paddle/pull/11229
- PR: Non-determinism in Sentiment analysis implementation: https://github.com/PaddlePaddle/Paddle/pull/11133
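As background for the atomicAdd item above: floating-point addition is not associative, so accumulating the same gradients in a different order can change the result. The sketch below shows that effect and one possible remedy (merging duplicate rows in a fixed order before applying them). It is a CPU illustration with hypothetical helper names, not the kernel from the PR.

```python
def accumulate_in_order(values, order):
    """Sum `values` following the index sequence `order`."""
    total = 0.0
    for i in order:
        total += values[i]
    return total


def sparse_sgd_deterministic(param, rows, grads, lr):
    """Sparse SGD step that sums duplicate rows in input order first."""
    merged = {}
    for row, grad in zip(rows, grads):
        merged[row] = merged.get(row, 0.0) + grad
    # Apply one update per row in sorted order, so no row is touched twice.
    for row in sorted(merged):
        param[row] -= lr * merged[row]
    return param


updates = [0.1, 0.2, 0.3]
forward = accumulate_in_order(updates, [0, 1, 2])
backward = accumulate_in_order(updates, [2, 1, 0])
order_matters = forward != backward

param = sparse_sgd_deterministic([1.0, 1.0, 1.0], rows=[0, 2, 0],
                                 grads=[0.5, 1.0, 0.25], lr=0.5)
```

With unordered atomic adds the accumulation order depends on thread scheduling, which is why two runs on identical inputs can differ in the last bits.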
-
Others:
- PR: Fix signed-unsigned: https://github.com/PaddlePaddle/Paddle/pull/11167
-
Finished 2 chapters in book following the new Fluid API
-
Found a few issues while working on the chapter re-writing
-
Reviewed PRs:
- PR:
- Recognize digit example updated with high level api draft: https://github.com/PaddlePaddle/book/pull/528
- Recognize digit example train script updated and draft 2: https://github.com/PaddlePaddle/book/pull/529
- Recognize digit Chinese Markdown update: https://github.com/PaddlePaddle/book/pull/531
- Recognize digit Update MNIST to use optimizer_func: https://github.com/PaddlePaddle/book/pull/535
- Image Classification train.py: https://github.com/PaddlePaddle/book/pull/533
- Issues: https://github.com/PaddlePaddle/book/issues/527 https://github.com/PaddlePaddle/VisualDL/issues/459
- PR:
- Recommendation System Book chapter 5 with high level api code and documentation: https://github.com/PaddlePaddle/book/pull/526
- Second draft of Recommendation System Book https://github.com/PaddlePaddle/Paddle/pull/11252
- Update High level API test of Recommendation System https://github.com/PaddlePaddle/Paddle/pull/11252
- Reviews:
- Updates to handle language and version switching, fully working menu editor on PaddlePaddle.org: https://github.com/PaddlePaddle/PaddlePaddle.org/pull/481
- Testing paddle-onnx on TensorRT issues for nGraph-ing