Implement sched_[gs]etaffinity()

Fri Apr 26 08:53:00 GMT 2019

On Apr 26 01:44, Mark Geisert wrote:
> On Wed, 17 Apr 2019, Corinna Vinschen wrote:
> > On Apr 16 21:31, Mark Geisert wrote:
> > > On Tue, 16 Apr 2019, Corinna Vinschen wrote:
> > > > On Apr 16 01:19, Mark Geisert wrote:
> > > > >   Anybody know if one can
> > > > > depend on the group membership of the first processor group to apply to all
> > > > > groups?
> > > > 
> > > > Maybe https://go.microsoft.com/fwlink/p/?linkid=147914 helps?
> > > > 
> > > > "If the number of logical processors exceeds the maximum group size,
> > > >  Windows creates multiple groups by splitting the node into n groups,
> > > >  where the first n-1 groups have capacities that are equal to the group
> > > >  size."
> > > 
> > > Great; thanks for that.
> > > 
> > > > [...]
> > > > Therefore:
> > > > 
> > > >  WORD cpu_group = cpu_number / num_cpu_per_group;
> > > >  KAFFINITY cpu_mask = 1L << (cpu_number % num_cpu_per_group);
> > > > 
> > > > That also means the transposition between the groupless linux system
> > > > and the Windows system is fairly easy.
> > > 
> > > Yes, dealing with an array of unsigned longs vs bitblt ops FTW.
> 
> I've been doing research to more fully understand the non-symmetric API for
> Windows affinity ops.  I came across a non-MS document online that discusses
> affinity on Windows with >64 CPUs.  The author works on "Process Lasso", a
> product that attempts to balance performance of apps across CPUs.
> 
> Anyway, he says processors are divided evenly among groups.  One reason for
> this is that Windows allocates new processes round-robin among the processor
> groups.  This won't balance properly if some groups have more processors
> than other groups.  Here's a link to the doc:
> https://bitsum.com/general/the-64-core-threshold-processor-groups-and-windows/
> 
> I'm not trying to muddy the waters, I'm just trying to figure out if there
> are different processor group assignment methods for different kinds of
> systems, SMP vs NUMA for instance.

That's what the __get_cpus_per_group function in miscfuncs.cc is for.
Once you know the number of CPUs per group, transposing between the
grouped and the linear representation, in either direction, is no problem.
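
As a rough illustration of that transposition, here is a minimal sketch in
plain C.  It assumes every group except possibly the last holds exactly
num_cpu_per_group CPUs and that a group never exceeds 64 CPUs.  The struct
and function names are hypothetical stand-ins for illustration only (they
are not the Windows GROUP_AFFINITY type nor anything in miscfuncs.cc), and
uint64_t is used in place of KAFFINITY to keep the example portable.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a (group, mask) pair; not the Windows type. */
struct group_mask {
    uint16_t group;   /* processor group index */
    uint64_t mask;    /* bit mask within that group */
};

/* Linear CPU number -> (group, single-bit mask within the group). */
static struct group_mask
linear_to_grouped(unsigned cpu_number, unsigned num_cpu_per_group)
{
    struct group_mask gm;
    gm.group = (uint16_t)(cpu_number / num_cpu_per_group);
    gm.mask = (uint64_t)1 << (cpu_number % num_cpu_per_group);
    return gm;
}

/* (group, bit position within the group) -> linear CPU number. */
static unsigned
grouped_to_linear(unsigned group, unsigned bit, unsigned num_cpu_per_group)
{
    return group * num_cpu_per_group + bit;
}

/* Collect the mask of one group out of a linear bit array of 64-bit
   words, as a Linux-style CPU set would be laid out.  Bits at or beyond
   nbits are treated as unset.  Assumes num_cpu_per_group <= 64. */
static uint64_t
group_mask_from_linear(const uint64_t *bits, size_t nbits,
                       unsigned group, unsigned num_cpu_per_group)
{
    uint64_t mask = 0;
    for (unsigned bit = 0; bit < num_cpu_per_group; bit++) {
        size_t cpu = (size_t)group * num_cpu_per_group + bit;
        if (cpu < nbits && ((bits[cpu / 64] >> (cpu % 64)) & 1))
            mask |= (uint64_t)1 << bit;
    }
    return mask;
}
```

With 48 CPUs per group, for example, linear CPU 50 lands in group 1 at bit
position 2, and grouped_to_linear(1, 2, 48) maps it back to 50.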

The non-NUMA vs. NUMA difference is just an under-the-hood design detail:
Windows tries to keep closely related nodes together if possible.

> I don't think the code I've got is robust enough to submit yet.  I suppose I
> could ship what should work, i.e., single-group processes and threads and
> just return ENOSYS for multi-group ops.  Or just hold off 'til done.

Nah, no worries.  We're in no hurry.

Thanks,
Corinna

-- 
Corinna Vinschen
Cygwin Maintainer