ABSTRACT
Implementations of map-reduce are being used to perform many operations on very large data. We explore alternative ways that a system could use the environment and capabilities of map-reduce implementations such as Hadoop. In particular, we look at strategies for computing the natural join of several relations. The general strategy we employ is to identify certain attributes of the multiway join that are part of the "map-key," an identifier for a particular Reduce process to which the Map processes send tuples. Each attribute of the map-key gets a "share," which is the number of buckets into which its values are hashed, to form a component of the identifier of a Reduce process. Relations have their tuples replicated in limited fashion, the degree of replication depending on the shares for those map-key attributes that are missing from their schema. We study the problem of optimizing the shares, given a fixed product (i.e., a fixed number of Reduce processes). An algorithm is given for detecting and fixing problems where a variable is mistakenly included in the map-key. Then, we consider two important special cases: chain joins and star joins. In each case we are able to determine the map-key and the shares that yield the least amount of replication.
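The share-based replication described above can be illustrated with a minimal sketch. The relation names, data, and share values here are hypothetical examples, not taken from the paper: a three-way chain join R(A,B) ⋈ S(B,C) ⋈ T(C,D) with map-key (B, C), where B gets share `b` and C gets share `c`, so there are b·c Reduce processes. S has both map-key attributes and goes to exactly one reducer; R lacks C, so each R-tuple is replicated across all c buckets of C, and symmetrically T is replicated across all b buckets of B.

```python
from collections import defaultdict
from itertools import product

# Hypothetical shares: map-key is (B, C); the fixed product is k = b * c reducers.
b, c = 2, 3

def h(value, buckets):
    # Hash an attribute value into one of a fixed number of buckets.
    return hash(value) % buckets

def map_phase(R, S, T):
    """Emit tagged tuples to reducer ids (i, j) for R(A,B) |><| S(B,C) |><| T(C,D)."""
    out = defaultdict(list)
    for a, b_val in R:                 # R lacks C: replicate over all c buckets of C
        for j in range(c):
            out[(h(b_val, b), j)].append(('R', (a, b_val)))
    for b_val, c_val in S:             # S has the whole map-key: exactly one reducer
        out[(h(b_val, b), h(c_val, c))].append(('S', (b_val, c_val)))
    for c_val, d in T:                 # T lacks B: replicate over all b buckets of B
        for i in range(b):
            out[(i, h(c_val, c))].append(('T', (c_val, d)))
    return out

def reduce_phase(reducers):
    """Each Reduce process joins the tuples it received, locally."""
    result = set()
    for tagged in reducers.values():
        Rs = [t for tag, t in tagged if tag == 'R']
        Ss = [t for tag, t in tagged if tag == 'S']
        Ts = [t for tag, t in tagged if tag == 'T']
        for (a, b1), (b2, c1), (c2, d) in product(Rs, Ss, Ts):
            if b1 == b2 and c1 == c2:
                result.add((a, b1, c1, d))
    return result

R = [(1, 'x'), (2, 'y')]
S = [('x', 'p'), ('y', 'q')]
T = [('p', 10), ('q', 20)]
print(sorted(reduce_phase(map_phase(R, S, T)), key=str))
# → [(1, 'x', 'p', 10), (2, 'y', 'q', 20)]
```

Note that every joining triple of tuples meets at exactly one reducer, and only R and T pay a replication cost (c and b copies per tuple, respectively); choosing the shares b and c to minimize that total cost, subject to b·c = k, is the optimization problem the paper studies.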