proposal: add a handful of acquire/release atomics (make the internal runtime versions public) #70607

Open
eloff opened this issue Nov 28, 2024 · 5 comments

@eloff
eloff commented Nov 28, 2024

Proposal Details

I asked for this a decade ago, and I want to start this discussion up again.

I'm not saying Go should add the full gamut of relaxed, acquire, release, etc. orderings available in other languages.

But I am saying it's useful to have some basic acquire and release loads and stores for 32-bit, 64-bit, and uintptr values.

To see that this is useful, one need look no further than the Go runtime, which uses exactly these.

uintptr:
internal/runtime/atomic.LoadAcquintptr
internal/runtime/atomic.StoreReluintptr

uint32:
internal/runtime/atomic.LoadAcq
internal/runtime/atomic.StoreRel

uint64:
internal/runtime/atomic.LoadAcq64
internal/runtime/atomic.StoreRel64

I would like us to make these six functions available in the sync/atomic package, along with LoadAcq and StoreRel methods on each of the atomic types. I can make a PR.

It's not that these are so advanced and so dangerous that only Go runtime authors should have them. We deserve advanced tools too. For a language with concurrency as a strength, it's lacking in some important primitives to develop performant concurrent data structures. And it's not that we can't have that in Go, it's already implemented, it just isn't public.

@eloff eloff added the Proposal label Nov 28, 2024
@gopherbot gopherbot added this to the Proposal milestone Nov 28, 2024
@gabyhelp

Related Issues

(Emoji vote if this was helpful or unhelpful; more detailed feedback welcome in this discussion.)

@randall77
Contributor

As far as I can tell the current *Rel and *Acq versions only differ from the standard versions on ppc64.
Is that your understanding as well?

@eloff
Author
eloff commented Nov 28, 2024

Oh, you mean they don't have proper acquire or release semantics on other platforms? Interesting. Could we have that for the runtime and the atomics library?

My understanding is that the standard Go atomics promise sequential consistency, which requires a lock prefix or fence on x86/64 processors. The documentation states "This definition provides the same semantics as C++'s sequentially consistent atomics". Since x86 mov instructions already have acquire/release semantics, sequential consistency needs a fence or a stronger instruction (I believe it's common to abuse the xchg instruction for this).

@Jorropo
Member
Jorropo commented Nov 28, 2024

The intrinsics are only implemented for PPC64:

	addF("internal/runtime/atomic", "LoadAcq",
		func(s *state, n *ir.CallExpr, args []*ssa.Value) *ssa.Value {
			v := s.newValue2(ssa.OpAtomicLoadAcq32, types.NewTuple(types.Types[types.TUINT32], types.TypeMem), args[0], s.mem())
			s.vars[memVar] = s.newValue1(ssa.OpSelect1, types.TypeMem, v)
			return s.newValue1(ssa.OpSelect0, types.Types[types.TUINT32], v)
		},
		sys.PPC64)
	addF("internal/runtime/atomic", "LoadAcq64",
		func(s *state, n *ir.CallExpr, args []*ssa.Value) *ssa.Value {
			v := s.newValue2(ssa.OpAtomicLoadAcq64, types.NewTuple(types.Types[types.TUINT64], types.TypeMem), args[0], s.mem())
			s.vars[memVar] = s.newValue1(ssa.OpSelect1, types.TypeMem, v)
			return s.newValue1(ssa.OpSelect0, types.Types[types.TUINT64], v)
		},
		sys.PPC64)

	addF("internal/runtime/atomic", "StoreRel",
		func(s *state, n *ir.CallExpr, args []*ssa.Value) *ssa.Value {
			s.vars[memVar] = s.newValue3(ssa.OpAtomicStoreRel32, types.TypeMem, args[0], args[1], s.mem())
			return nil
		},
		sys.PPC64)
	addF("internal/runtime/atomic", "StoreRel64",
		func(s *state, n *ir.CallExpr, args []*ssa.Value) *ssa.Value {
			s.vars[memVar] = s.newValue3(ssa.OpAtomicStoreRel64, types.TypeMem, args[0], args[1], s.mem())
			return nil
		},
		sys.PPC64)

Looking through the .s and .go implementation files, here are the edge cases:

  • Load Acquire on amd64 is implemented without atomic instructions, but so is any atomic load on amd64, because of its strong memory model.
  • The wasm implementations use non-atomic Go primitives, but wasm also lacks SMP.

Otherwise they are TCO wrappers around the sequentially consistent atomics.

Note: the compiler internally uses these function calls to disable certain optimizations, among other things. Just because the implementation of an atomic uses non-atomic operations does not mean it is safe to do the same outside of these exact functions.

If anything, I guess this is because no one has found a compelling use for these. We could implement relaxed store and acquire load on most architectures people would care about if this proposal required it, and fall back to the seq-cst implementations on the others, as we currently do.

@eloff
Author
eloff commented Nov 28, 2024

Interesting, so Go just uses a plain mov on x86/64 for an atomic load, which is actually the same as what C++ does.

It implements the Store with XCHG, which is also what C++ does.

So then the question is: if we implement the release store with mov instead of xchg on x86/x64, does it matter? It would be faster, but would the difference be impactful? I would guess it matters a lot more on more weakly ordered architectures like ARM.

Go has an extensive benchmark suite, and it uses StoreRel in some parts of the runtime. So I guess we could test this for the Go runtime pretty easily.

It'd be nice to test it for a synthetic benchmark too, to see how much of a difference there is in theory. I could start with that.
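As a rough starting point, a synthetic benchmark could look something like the sketch below. It uses only sync/atomic and testing.Benchmark from the standard library; the helper names and the sink variable are made up for the example. The big assumption is that a plain store stands in for a release store: that roughly matches the mov that would be emitted on amd64, but it is not a valid release store under the Go memory model and says nothing about ARM or ppc64, so treat the numbers as an upper bound on the possible gain at best.

```go
package main

import (
	"fmt"
	"sync/atomic"
	"testing"
)

// sink is a package-level target so the stores are not trivially removed.
var sink uint32

// benchAtomicStore times the sequentially consistent store
// (an XCHG on amd64).
func benchAtomicStore() testing.BenchmarkResult {
	return testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			atomic.StoreUint32(&sink, uint32(i))
		}
	})
}

// benchPlainStore times a plain store, used here only as a stand-in for
// what a release store might cost on amd64 (a MOV). The compiler is free
// to optimize plain stores, so this number is only indicative.
func benchPlainStore() testing.BenchmarkResult {
	return testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			sink = uint32(i)
		}
	})
}

func main() {
	fmt.Println("atomic store:", benchAtomicStore())
	fmt.Println("plain store: ", benchPlainStore())
}
```

A single-goroutine loop like this only measures the uncontended cost of the instruction; a more realistic test would also have readers on other cores.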

@ianlancetaylor ianlancetaylor moved this to Incoming in Proposals Nov 28, 2024