Boosting Machine Transformer
Boosting-based ONNX implementation for classification.
- Input
  - 2277-dim embedding
- Encoder
  - 60 x Transformer with 40 heads
- Output
  - accuracy projection
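
The outline above can be written down as a small config object. This is only a sketch: the class name `ModelSpec` and its field names are hypothetical and do not come from the implementation. It also checks whether the embedding width splits evenly across the attention heads, which common multi-head attention layers require. With the listed values (2277 and 40) it does not, so the actual model may use a different internal width or head layout.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelSpec:
    """Hypothetical container for the numbers listed in the outline."""
    embed_dim: int = 2277   # from the Input entry
    num_layers: int = 60    # from the Encoder entry
    num_heads: int = 40     # from the Encoder entry

    def head_dim_remainder(self) -> int:
        # Remainder left over when embed_dim is divided across the heads.
        return self.embed_dim % self.num_heads

    def splits_evenly(self) -> bool:
        # True only if every head would get the same integer width.
        return self.head_dim_remainder() == 0


spec = ModelSpec()
print(spec.splits_evenly(), spec.head_dim_remainder())  # prints: False 37
```

Running the check before exporting or loading the model is a cheap way to catch a mismatch between the documented and the actual dimensions.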
Training config
optimizer=SGD, lr=0.989, scheduler=cyclic, warmup=183
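
One way to read the schedule settings is sketched below in plain Python. Several things here are assumptions, not taken from the config: that `warmup=183` counts optimizer steps, that warmup is a linear ramp up to `lr`, that `lr=0.989` is the peak of the cycle, and that the cycle is triangular. The floor `base_lr` and the half-cycle length `half_cycle` are hypothetical values chosen only for illustration.

```python
def lr_at(step, peak_lr=0.989, warmup=183, base_lr=0.0989, half_cycle=1000):
    """Learning rate at a given step: linear warmup, then a triangular cycle.

    base_lr and half_cycle are illustrative assumptions, not config values.
    """
    if step < warmup:
        # Linear ramp that reaches peak_lr on the last warmup step.
        return peak_lr * (step + 1) / warmup
    pos = (step - warmup) % (2 * half_cycle)
    # frac runs 0 -> 1 -> 0 over one full cycle.
    frac = pos / half_cycle if pos < half_cycle else 2 - pos / half_cycle
    # Start each cycle at the peak, descend to base_lr, then climb back.
    return peak_lr - (peak_lr - base_lr) * frac


for s in (0, 182, 183, 1183, 2183):
    print(s, round(lr_at(s), 4))
```

The loop prints the rate at the first step, the end of warmup, the start of the first cycle, its low point, and the start of the second cycle.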