* albert flax * year -> 2021 * docstring updated for flax * removed head_mask * removed from_pt * removed passing attention_mask to embedding layer