Base station clustering is necessary in large interference networks, where the channel state information (CSI) acquisition overhead would otherwise be overwhelming. In this paper, we propose a novel long-term throughput model for the clustered users that captures the trade-off between interference mitigation capability and CSI acquisition overhead. The model depends only on statistical CSI, thus enabling long-term clustering. Based on notions from coalitional game theory, we propose a low-complexity distributed clustering method. The algorithm converges in a couple of iterations and requires only limited communication between base stations. Numerical simulations show the viability of the proposed approach.