2017 12 27
- Gradient Check of RNN
  - [WIP] https://github.com/PaddlePaddle/Paddle/pull/7068
  - Tensor::has_nan/has_inf
  - Evaluator raises an exception on NaN/Inf
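
The NaN/Inf items above can be illustrated with a minimal pure-Python sketch. It is not Paddle's `Tensor::has_nan`/`has_inf` implementation; the function names, the flat list standing in for a tensor, and the `NonFiniteValueError` class are all hypothetical and chosen only for illustration.

```python
import math


class NonFiniteValueError(ValueError):
    """Hypothetical error type raised when a buffer holds NaN or Inf."""


def has_nan(values):
    """Return True if any element of the flat sequence is NaN."""
    return any(math.isnan(v) for v in values)


def has_inf(values):
    """Return True if any element is +Inf or -Inf."""
    return any(math.isinf(v) for v in values)


def check_finite(name, values):
    """Raise instead of letting NaN/Inf propagate silently (assumed behavior)."""
    if has_nan(values):
        raise NonFiniteValueError(f"{name} contains NaN")
    if has_inf(values):
        raise NonFiniteValueError(f"{name} contains Inf")
    return values
```

Calling `check_finite("loss", [0.5, float("nan")])` raises `NonFiniteValueError`, which is the kind of early failure an evaluator check is meant to give during a gradient check.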
- Rename API of DevCtx
- Polish Scope::LocalVarNames
- Speed up ColwiseSum on CPU
- Rewrite AdamOp
- Set RelWithDebInfo
Multi-device:
- add data layout
- add library type
- refine OpKernelType
- add memory switch mechanism in operator kernel switch
- cache memory in local scope
Fix and Enhance
- update the docs on supporting a new device
- remove unused place
- remove unused usage_stat script: https://github.com/PaddlePaddle/Paddle/pull/6880
- unify the indentation of license: https://github.com/PaddlePaddle/Paddle/pull/7022
- refine CMakeLists.txt for the case where adding an op needs DEPS: https://github.com/PaddlePaddle/Paddle/pull/7067
- MKL
- update alexnet training data: https://github.com/PaddlePaddle/Paddle/pull/6878
- Add "download mklml failed" into FAQ: https://github.com/PaddlePaddle/Paddle/pull/7009
- code review:
- enable alexnet benchmark: https://github.com/PaddlePaddle/Paddle/pull/6852
- use small samples to infer openblas: https://github.com/PaddlePaddle/Paddle/pull/6755
- enable MKL Packed Recurrent Layer: https://github.com/PaddlePaddle/Paddle/pull/6719
- doc
- Complete refactor of backward
- Update DataFeeder and inference model io according to users' feedback
- Other improvements and fixes:
- Reviews:
- Update doc of V2 api
- Performance validation of understand_sentiment in fluid
  - https://github.com/PaddlePaddle/Paddle/pull/7004
  - https://github.com/PaddlePaddle/Paddle/issues/7046
- Add gpu support for NCE_layer.
- [WIP] Implement adaptive softmax.
- Book.04 word2vec speed performance comparison with V2.
- Set up the onnx environment and learned how it should interact with VisualDL
- Finished graph data design for graph in VisualDL
- Add edges to graph proto so that frontend can render more easily (WIP)
- Updated data format design for VisualDL
- Serialize and Deserialize SelectedRows, https://github.com/PaddlePaddle/Paddle/pull/7042
- BlockingCounter for ThreadPool, https://github.com/PaddlePaddle/Paddle/pull/7000
- Bug fix
- install python-tk, https://github.com/PaddlePaddle/Paddle/pull/7095
- PR Review:
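
The "BlockingCounter for ThreadPool" item above refers to a common synchronization pattern: a counter that lets one thread block until a fixed number of tasks have finished. The sketch below illustrates that general idea in Python with `threading.Condition`; it is not the code from the linked PR, and the class layout and the `run_tasks` helper are assumptions made for the example.

```python
import threading


class BlockingCounter:
    """Blocks waiters until the count has been decremented to zero."""

    def __init__(self, initial_count):
        if initial_count < 0:
            raise ValueError("initial_count must be non-negative")
        self._count = initial_count
        self._cond = threading.Condition()

    def decrement(self):
        """Signal that one task has finished."""
        with self._cond:
            if self._count <= 0:
                raise RuntimeError("counter decremented below zero")
            self._count -= 1
            if self._count == 0:
                self._cond.notify_all()

    def wait(self):
        """Block the caller until every task has called decrement()."""
        with self._cond:
            while self._count > 0:
                self._cond.wait()


def run_tasks(n):
    """Hypothetical helper: run n trivial tasks and wait for all of them."""
    counter = BlockingCounter(n)
    results = []
    lock = threading.Lock()

    def task(i):
        with lock:
            results.append(i * i)
        counter.decrement()

    for i in range(n):
        threading.Thread(target=task, args=(i,)).start()
    counter.wait()  # returns only after all n tasks have decremented
    return sorted(results)
```

With this sketch, `run_tasks(4)` returns only after all four worker threads have recorded their result.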
- Profiling:
- Refine how the LSTM operator gets the activation type, to speed it up.
- Speed up the data reader for the IMDB dataset.
- Optimize the rowwise add function.
- Speed based on a three-layer stacked LSTM model:
- GPU: 166.95994s -> 87.30287s
- CPU: 385.2211s -> 294.90407s
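
To put the timings above in relative terms, the following small sketch computes the speedup factor and the percentage of time saved from the before/after numbers reported in this list. The helper names are made up for the example; only the four timings come from the notes.

```python
def speedup(before_s, after_s):
    """Speedup factor: how many times faster the optimized run is."""
    return before_s / after_s


def time_reduction_pct(before_s, after_s):
    """Percentage of wall-clock time saved relative to the baseline."""
    return 100.0 * (before_s - after_s) / before_s


# (before, after) timings in seconds, copied from the list above.
timings = {
    "GPU": (166.95994, 87.30287),
    "CPU": (385.2211, 294.90407),
}

for device, (before, after) in timings.items():
    print(
        f"{device}: {speedup(before, after):.2f}x faster, "
        f"{time_reduction_pct(before, after):.1f}% less time"
    )
```

This works out to roughly a 1.91x speedup on GPU and 1.31x on CPU.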
- Benchmark Model:
- Make the ResNet of TensorFlow consistent with Paddle
- Mobile:
- Code Review:
- Implement ResNeXt for image classification
- Working on SENet [WIP]
- Multi Device
- Code optimization
- Review
- Doc:
- Polish accuracy doc: https://github.com/PaddlePaddle/Paddle/pull/7091
- Fix transpose op doc: https://github.com/PaddlePaddle/Paddle/pull/7020
- Models test:
- Use 'time' to monitor resources while running the training model
- Add a script to analyze the training log
- VGG16 performance comparison with TensorFlow
- Convergence comparison with TF on CPU
- Speed comparison with TF on CPU
- Internal convergence comparison on CPU and GPU
- Memory allocation comparison with TF
- Update and merge the VGG16 benchmark scripts
- Add the parsing part for the profiling tool
- Polish the doc of cross_entropy_op
- Fix problems in two docs
- Code Review:
PR
- Refine cos-sim-op
- Refine sgd-op
- Add conv2d_python doc
- Fix embedding example
Performance analysis: ResNet and VGG16
Review
- remove GPU Sync Interface
- Refine CUDA profiler and delete the test file
- Use for_range to rewrite adam
- Speed up the data reader for the IMDB dataset.
- Optimize the rowwise add function
- Add vgg16 benchmark configuration
- detection_output op (for SSD): in progress, under code review
- norm op: in progress, under code review
- run caffe ssd demo
- Framework
- Add a simple C++ inference example for fluid
- Mobile
- Always link protobuf-lite for mobile inference
- Multi box loss operator: https://github.com/PaddlePaddle/Paddle/pull/6946
- code review:
- run paddle v2 SSD demo
- fluid
- VisualDL with @longfei @daming
- models ci with @haoshuang, reviews:
  - https://github.com/PaddlePaddle/regtest/pull/8#pullrequestreview-85679209
  - https://github.com/PaddlePaddle/regtest/pull/9#pullrequestreview-85760472
- improve send/recv op:
- single thread block => async: https://github.com/PaddlePaddle/Paddle/compare/develop...gongweibao:asyncsendrecv?expand=1
- Fix bugs:
- create vars bugs: https://github.com/PaddlePaddle/Paddle/pull/7060
- Fix demo code bug in usage doc: https://github.com/PaddlePaddle/cloud/pull/535
- ISSUE:
- code review:
- refine distributed transpiler
- add scatter functors
- Fluid
- add DataType Transform
- Fix ThreadPool
- add multi kernel register
- add data layout in Tensor
- switch GPUPlace with CUDAPlace
- fix/copyfrom context
- refine op_kernel key
- remove GPU Sync interface
- switch OperatorBase Run with place/Global DeviceContext
- fix/Place
- Benchmark
  - Reviews
    - https://github.com/dzhwinter/benchmark/pull/35
    - https://github.com/dzhwinter/benchmark/pull/34
- PR & Review
- [optimized] https://github.com/PaddlePaddle/Paddle/pull/7034
- [multi-thread] https://github.com/PaddlePaddle/Paddle/pull/6751
- [CAPI-doc] https://github.com/PaddlePaddle/Paddle/pull/6596
- Add stacked dynamic lstm model for fluid: https://github.com/dzhwinter/benchmark/pull/34
- Add seq2seq model for tf: https://github.com/dzhwinter/benchmark/pull/31
- Add stacked dynamic lstm model for tf: https://github.com/dzhwinter/benchmark/pull/35
- Code Review
  - https://github.com/PaddlePaddle/Paddle/pull/6986#pullrequestreview-85519491
  - https://github.com/PaddlePaddle/Paddle/pull/6779#pullrequestreview-85241336