The original algorithm was suffering from an ABA race condition:
A: fp = page_stack
B: completely allocates the same page and writes into it some data
A: unsuspecting, loads (invalid) next = fp->next
B: finishes working with the page and returns it back to page_stack
A: compare-exchange page_stack: fp => next succeeds and writes garbage
to page_stack
Fixed this by using an implicit spinlock in hot page allocator.