On Partitioning and Sharding

Mr.Manager 2025. 6. 30. 21:23

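As a database grows, there comes a point where a single table can no longer be managed efficiently. Large data volumes in particular lead to performance bottlenecks, and two common ways to address them are partitioning and sharding.

In this post, we use Spring Boot and MySQL, with a Product table as the running example, to look at how the two approaches differ and what their trade-offs are.

Example Scenario

The Product entity we will work with has the following structure: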

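import jakarta.persistence.*; // javax.persistence.* on Spring Boot 2.x
import java.math.BigDecimal;
import java.time.LocalDateTime;

@Entity
@Table(name = "product")
public class Product {
    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Long id;

    private String name;
    private String type; // "상의" (tops), "하의" (bottoms), "모자" (hats), "신발" (shoes)
    private BigDecimal price;
    private Integer stock;
    private LocalDateTime createdAt;

    // Constructors, getters, and setters omitted
}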

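Partitioning

What Is Partitioning?

Partitioning splits one table into multiple partitions inside a single database. The application still sees one logical table, and all partitions live on the same database server; only the rows are divided among the partitions according to a partitioning key.

Partitioning Layout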

┌─────────────────────────────────────────┐
│              MySQL Database             │
├─────────────────────────────────────────┤
│  Product Table (Partitioned by type)    │
├─────────────────────────────────────────┤
│  ┌─────────────┐  ┌─────────────┐       │
│  │ Partition 1 │  │ Partition 2 │       │
│  │   (tops)    │  │  (bottoms)  │       │
│  └─────────────┘  └─────────────┘       │
│  ┌─────────────┐  ┌─────────────┐       │
│  │ Partition 3 │  │ Partition 4 │       │
│  │   (hats)    │  │   (shoes)   │       │
│  └─────────────┘  └─────────────┘       │
└─────────────────────────────────────────┘

Implementing Partitioning in MySQL

-- Create the partitioned table
CREATE TABLE product (
    id BIGINT AUTO_INCREMENT,
    name VARCHAR(255) NOT NULL,
    type VARCHAR(50) NOT NULL,
    price DECIMAL(10,2) NOT NULL,
    stock INT NOT NULL,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    -- MySQL requires the partitioning column (type) to be part of
    -- every unique key, including the primary key
    PRIMARY KEY (id, type)
)
PARTITION BY LIST COLUMNS(type) (
    PARTITION p_top VALUES IN ('상의'),     -- tops
    PARTITION p_bottom VALUES IN ('하의'),  -- bottoms
    PARTITION p_hat VALUES IN ('모자'),     -- hats
    PARTITION p_shoes VALUES IN ('신발')    -- shoes
);

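Using Partitioning from Spring Boot

import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Timestamp;
import java.util.List;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.jdbc.core.JdbcTemplate;
import org.springframework.stereotype.Repository;

@Repository
public class ProductRepository {

    @Autowired
    private JdbcTemplate jdbcTemplate;

    // Find products of one type (benefits from partition pruning)
    public List<Product> findByType(String type) {
        String sql = "SELECT * FROM product WHERE type = ?";
        // The varargs overload replaces the deprecated query(sql, Object[], RowMapper)
        return jdbcTemplate.query(sql, (rs, rowNum) -> mapRowToProduct(rs), type);
    }

    // Find all products (scans every partition)
    public List<Product> findAll() {
        String sql = "SELECT * FROM product";
        return jdbcTemplate.query(sql, (rs, rowNum) -> mapRowToProduct(rs));
    }

    private Product mapRowToProduct(ResultSet rs) throws SQLException {
        Product product = new Product();
        product.setId(rs.getLong("id"));
        product.setName(rs.getString("name"));
        product.setType(rs.getString("type"));
        product.setPrice(rs.getBigDecimal("price"));
        product.setStock(rs.getInt("stock"));
        // created_at is nullable in the schema, so guard against NULL
        Timestamp createdAt = rs.getTimestamp("created_at");
        product.setCreatedAt(createdAt != null ? createdAt.toLocalDateTime() : null);
        return product;
    }
}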

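Advantages of Partitioning

Better query performance: partition pruning lets MySQL scan only the partitions a query needs
Index efficiency: each partition maintains its own, smaller indexes
Easier maintenance: partitions can be managed individually, for example backed up, restored, or dropped one at a time

Limits of Partitioning

Memory: still bounded by the memory of a single database server
QPS: bounded by the number of requests one DB server can handle
Scaling: relies on vertical scaling (scale-up) only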
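
To make partition pruning concrete, here is a minimal in-memory sketch. It is a toy model, not how MySQL stores or scans partitions, and the class `PartitionedProducts` and its methods are hypothetical names chosen for this illustration. It keeps one bucket per type and records how many buckets a query had to touch.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model of a LIST-partitioned table: one bucket of product names per type.
// Illustration only; MySQL's real storage engine works differently.
class PartitionedProducts {
    private final Map<String, List<String>> partitions = new LinkedHashMap<>();
    private int partitionsScanned = 0; // buckets touched by the most recent query

    PartitionedProducts(String... types) {
        for (String type : types) {
            partitions.put(type, new ArrayList<>());
        }
    }

    void insert(String type, String name) {
        List<String> partition = partitions.get(type);
        if (partition == null) {
            // Mirrors LIST partitioning: a value with no matching partition is rejected
            throw new IllegalArgumentException("No partition for type: " + type);
        }
        partition.add(name);
    }

    // Equivalent of WHERE type = ?: the key is known, so at most one bucket is read
    List<String> findByType(String type) {
        List<String> partition = partitions.get(type);
        partitionsScanned = (partition == null) ? 0 : 1;
        return partition == null ? new ArrayList<>() : new ArrayList<>(partition);
    }

    // No filter on the partition key: every bucket has to be read
    List<String> findAll() {
        partitionsScanned = partitions.size();
        List<String> all = new ArrayList<>();
        partitions.values().forEach(all::addAll);
        return all;
    }

    int partitionsScanned() {
        return partitionsScanned;
    }
}
```

With four types registered, a `findByType` call reports one scanned partition, while `findAll` reports four. On a real MySQL table, the partitions column of EXPLAIN output serves the same purpose.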

Sharding

What Is Sharding?

Sharding distributes data that shares the same schema horizontally across multiple databases. Each shard runs on its own independent database server.

Sharding Layout

┌─────────────────┐  ┌─────────────────┐
│   MySQL DB 1    │  │   MySQL DB 2    │
│  (tops shard)   │  │ (bottoms shard) │
├─────────────────┤  ├─────────────────┤
│ Product Table   │  │ Product Table   │
│ - tops data     │  │ - bottoms data  │
└─────────────────┘  └─────────────────┘

┌─────────────────┐  ┌─────────────────┐
│   MySQL DB 3    │  │   MySQL DB 4    │
│  (hats shard)   │  │  (shoes shard)  │
├─────────────────┤  ├─────────────────┤
│ Product Table   │  │ Product Table   │
│ - hats data     │  │ - shoes data    │
└─────────────────┘  └─────────────────┘

Implementing Sharding in Spring Boot

1. Multiple data source configuration

@Configuration
public class ShardingConfig {

    @Bean
    @Primary
    public DataSource topDataSource() {
        return DataSourceBuilder.create()
            .url("jdbc:mysql://localhost:3306/product_top")
            .username("user")
            .password("password")
            .build();
    }

    @Bean
    public DataSource bottomDataSource() {
        return DataSourceBuilder.create()
            .url("jdbc:mysql://localhost:3307/product_bottom")
            .username("user")
            .password("password")
            .build();
    }

    @Bean
    public DataSource hatDataSource() {
        return DataSourceBuilder.create()
            .url("jdbc:mysql://localhost:3308/product_hat")
            .username("user")
            .password("password")
            .build();
    }

    @Bean
    public DataSource shoesDataSource() {
        return DataSourceBuilder.create()
            .url("jdbc:mysql://localhost:3309/product_shoes")
            .username("user")
            .password("password")
            .build();
    }
}

2. Shard routing service

@Service
public class ShardingService {

    private final Map<String, JdbcTemplate> shardMap;

    public ShardingService(
            @Qualifier("topDataSource") DataSource topDs,
            @Qualifier("bottomDataSource") DataSource bottomDs,
            @Qualifier("hatDataSource") DataSource hatDs,
            @Qualifier("shoesDataSource") DataSource shoesDs) {

        this.shardMap = Map.of(
            "상의", new JdbcTemplate(topDs),
            "하의", new JdbcTemplate(bottomDs),
            "모자", new JdbcTemplate(hatDs),
            "신발", new JdbcTemplate(shoesDs)
        );
    }

    public JdbcTemplate getShardByType(String type) {
        JdbcTemplate shard = shardMap.get(type);
        if (shard == null) {
            throw new IllegalArgumentException("Unknown product type: " + type);
        }
        return shard;
    }

    public Collection<JdbcTemplate> getAllShards() {
        return shardMap.values();
    }
}

3. Sharded repository implementation

import java.math.BigDecimal;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Timestamp;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.stream.Collectors;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.jdbc.core.JdbcTemplate;
import org.springframework.stereotype.Repository;

@Repository
public class ShardedProductRepository {

    @Autowired
    private ShardingService shardingService;

    // Find products of one type (touches a single shard)
    public List<Product> findByType(String type) {
        JdbcTemplate shard = shardingService.getShardByType(type);
        String sql = "SELECT * FROM product WHERE type = ?";
        // The varargs overload replaces the deprecated query(sql, Object[], RowMapper)
        return shard.query(sql, this::mapRowToProduct, type);
    }

    // Save a product (routed to the shard that owns its type)
    public void save(Product product) {
        JdbcTemplate shard = shardingService.getShardByType(product.getType());
        String sql = "INSERT INTO product (name, type, price, stock) VALUES (?, ?, ?, ?)";
        shard.update(sql, product.getName(), product.getType(), 
                    product.getPrice(), product.getStock());
    }

    // Find all products (touches every shard, so watch the cost)
    public List<Product> findAll() {
        List<Product> allProducts = new ArrayList<>();

        // Query the shards in parallel. supplyAsync without an executor runs on the
        // common ForkJoinPool, so blocking JDBC calls here compete with other tasks.
        List<CompletableFuture<List<Product>>> futures = 
            shardingService.getAllShards().stream()
                .map(shard -> CompletableFuture.supplyAsync(() -> 
                    shard.query("SELECT * FROM product", this::mapRowToProduct)))
                .collect(Collectors.toList());

        futures.forEach(future -> {
            try {
                allProducts.addAll(future.get());
            } catch (Exception e) {
                throw new RuntimeException("Failed to fetch from shard", e);
            }
        });

        return allProducts;
    }

    // Range search by price (price is not the shard key, so every shard is queried)
    public List<Product> findByPriceRange(BigDecimal minPrice, BigDecimal maxPrice) {
        List<Product> results = new ArrayList<>();
        String sql = "SELECT * FROM product WHERE price BETWEEN ? AND ?";

        for (JdbcTemplate shard : shardingService.getAllShards()) {
            List<Product> shardResults =
                shard.query(sql, this::mapRowToProduct, minPrice, maxPrice);
            results.addAll(shardResults);
        }

        // The global ordering has to be rebuilt in the application
        return results.stream()
                .sorted(Comparator.comparing(Product::getPrice))
                .collect(Collectors.toList());
    }

    private Product mapRowToProduct(ResultSet rs, int rowNum) throws SQLException {
        Product product = new Product();
        product.setId(rs.getLong("id"));
        product.setName(rs.getString("name"));
        product.setType(rs.getString("type"));
        product.setPrice(rs.getBigDecimal("price"));
        product.setStock(rs.getInt("stock"));
        // created_at is nullable in the schema, so guard against NULL
        Timestamp createdAt = rs.getTimestamp("created_at");
        product.setCreatedAt(createdAt != null ? createdAt.toLocalDateTime() : null);
        return product;
    }
}
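
The findAll method above starts one asynchronous task per shard. As a rough sketch of the same fan-out with a caller-supplied thread pool, the helper below takes one `Supplier` per shard, where each supplier stands in for a blocking call such as `shard.query(...)`. The class `ScatterGather` and its method name are assumptions made for this example, not part of Spring.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.function.Supplier;
import java.util.stream.Collectors;

// Fans one query out to every shard and concatenates the results.
// Each Supplier stands in for a blocking per-shard call.
class ScatterGather {

    static <T> List<T> queryAllShards(List<Supplier<List<T>>> shardQueries,
                                      ExecutorService executor) {
        // Start every shard query on the caller-supplied pool
        List<CompletableFuture<List<T>>> futures = shardQueries.stream()
            .map(query -> CompletableFuture.supplyAsync(query, executor))
            .collect(Collectors.toList());

        // join() waits for each future in list order; a failed shard surfaces
        // as an unchecked CompletionException
        List<T> merged = new ArrayList<>();
        for (CompletableFuture<List<T>> future : futures) {
            merged.addAll(future.join());
        }
        return merged;
    }
}
```

Passing an explicit executor keeps blocking database calls off the shared common pool and lets the pool size be matched to the number of shards. Results come back in the order the shard queries were listed, not in any global sort order.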

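Advantages of Sharding

Horizontal scalability: capacity and throughput grow by adding servers
Distributed memory: each shard uses its own memory
Higher QPS: multiple servers handle requests at the same time
Fault isolation: a failure in one shard does not take down the other shards

Challenges of Sharding

1. Range queries are complicated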

// Problem: every shard has to be queried, which hurts performance
public List<Product> findExpensiveProducts(BigDecimal threshold) {
    List<Product> results = new ArrayList<>();

    // Collect matching rows from every shard
    for (JdbcTemplate shard : shardingService.getAllShards()) {
        String sql = "SELECT * FROM product WHERE price > ?";
        // The varargs overload replaces the deprecated query(sql, Object[], RowMapper)
        results.addAll(shard.query(sql, this::mapRowToProduct, threshold));
    }

    // Rows arrive shard by shard, so the global sort happens in the application
    return results.stream()
            .sorted(Comparator.comparing(Product::getPrice).reversed())
            .collect(Collectors.toList());
}
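
When only the top few rows are needed, one common way to soften this cost is to let each shard sort and limit its own rows and then merge the short lists. The sketch below assumes each inner list is what a shard returned for a query like ORDER BY price DESC LIMIT n; `TopNMerge` is a hypothetical helper name, and it works on bare prices to stay self-contained.

```java
import java.math.BigDecimal;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

class TopNMerge {

    // perShardTopN: one list per shard, each assumed to hold that shard's
    // n highest prices. The global top n is always contained in their union,
    // so merging these short lists gives the same answer as sorting everything.
    static List<BigDecimal> mergeTopN(List<List<BigDecimal>> perShardTopN, int n) {
        return perShardTopN.stream()
            .flatMap(List::stream)
            .sorted(Comparator.reverseOrder()) // highest price first
            .limit(n)
            .collect(Collectors.toList());
    }
}
```

The application then sorts at most n rows per shard instead of every matching row. Every shard is still contacted, so this reduces data transfer and sorting work rather than the number of round trips.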

2. Transaction management is complicated

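// Problem: a transaction cannot easily span multiple shards
public void transferStock(Long fromProductId, String fromType, 
                         Long toProductId, String toType, Integer quantity) {

    if (!fromType.equals(toType)) {
        // A transaction across different shards would need a complex
        // protocol such as two-phase commit (2PC)
        throw new UnsupportedOperationException(
            "Cross-shard transactions not supported");
    }

    JdbcTemplate shard = shardingService.getShardByType(fromType);
    // A plain @Transactional binds to the transaction manager of the @Primary
    // DataSource only, so the transaction is opened explicitly on this shard's
    // DataSource. Uses org.springframework.transaction.support.TransactionTemplate and
    // org.springframework.jdbc.datasource.DataSourceTransactionManager; in real
    // code, create one transaction manager per shard up front.
    TransactionTemplate tx = new TransactionTemplate(
        new DataSourceTransactionManager(shard.getDataSource()));
    tx.executeWithoutResult(status -> {
        // Both updates commit or roll back together, within this one shard
        shard.update("UPDATE product SET stock = stock - ? WHERE id = ?", 
                    quantity, fromProductId);
        shard.update("UPDATE product SET stock = stock + ? WHERE id = ?", 
                    quantity, toProductId);
    });
}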
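
To show what the 2PC mentioned in the code comment involves, here is a heavily simplified coordinator. It is a sketch under strong assumptions: the `ShardParticipant` interface and `TwoPhaseCommit` class are invented for this example, and it ignores coordinator crashes, timeouts, and exceptions, which are exactly the parts that make a real implementation hard.

```java
import java.util.ArrayList;
import java.util.List;

// One participant per shard (hypothetical interface for this sketch).
interface ShardParticipant {
    boolean prepare();   // vote: true means "ready to commit"
    void commit();
    void rollback();
}

class TwoPhaseCommit {

    // Returns true if every shard committed, false if the work was rolled back.
    static boolean run(List<ShardParticipant> participants) {
        List<ShardParticipant> prepared = new ArrayList<>();

        // Phase 1: ask each shard to prepare; stop at the first "no" vote
        for (ShardParticipant participant : participants) {
            if (participant.prepare()) {
                prepared.add(participant);
            } else {
                // The shard that voted no is assumed to abort on its own;
                // the shards that had already prepared must be rolled back
                prepared.forEach(ShardParticipant::rollback);
                return false;
            }
        }

        // Phase 2: every shard voted yes, so tell all of them to commit
        participants.forEach(ShardParticipant::commit);
        return true;
    }
}
```

Even in this reduced form, every shard has to hold its locks between the two phases, which is one reason cross-shard transactions are often avoided by choosing a shard key that keeps related rows together.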

3. Joins are constrained

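// Problem: rows on different shards cannot be joined in SQL
public class OrderService {

    // When orders and products live on different shards, join at the application level
    public OrderDetailDTO getOrderWithProducts(Long orderId) {
        Order order = orderRepository.findById(orderId);

        List<Product> products = new ArrayList<>();
        for (OrderItem item : order.getItems()) {
            // One lookup per item (an N+1 pattern). Note that findByType loads every
            // product of that type and filters in memory, which gets costly quickly.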
            Product product = shardedProductRepository
                .findByType(item.getProductType())
                .stream()
                .filter(p -> p.getId().equals(item.getProductId()))
                .findFirst()
                .orElse(null);
            if (product != null) {
                products.add(product);
            }
        }

        return new OrderDetailDTO(order, products);
    }
}
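
One way to reduce the per-item lookups above is to group the order items by product type first, so that each shard receives a single query with an id list. The grouping step can be sketched as follows; `ShardBatching` is a hypothetical helper, and each item is modeled as a plain (type, id) pair instead of the OrderItem class, whose definition the post does not show.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class ShardBatching {

    // Each entry is a (productType, productId) pair taken from one order line.
    // Returns productType -> ids, so the caller can issue one
    // "WHERE id IN (...)" query per shard instead of one query per item.
    static Map<String, List<Long>> groupIdsByType(List<Map.Entry<String, Long>> items) {
        Map<String, List<Long>> idsByType = new LinkedHashMap<>();
        for (Map.Entry<String, Long> item : items) {
            idsByType.computeIfAbsent(item.getKey(), type -> new ArrayList<>())
                     .add(item.getValue());
        }
        return idsByType;
    }
}
```

With this grouping, an order that contains ten items of two types needs two shard queries rather than ten. Building the IN query against each shard is left out here because it depends on how the repository exposes id lookups.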

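Partitioning vs. Sharding

Aspect	Partitioning	Sharding
Scalability	Vertical scaling only	Horizontal scaling possible
Memory	Limited by a single DB server's memory	Independent memory per shard
QPS	Limited by a single DB server	Spread across multiple servers
Transactions	ACID guaranteed	Cross-shard transactions are complex
Joins	Ordinary joins work	No cross-shard joins at the database level
Range queries	Efficient	Must touch every shard when the filter is not on the shard key
Complexity	Relatively simple	More application-level complexity
Fault isolation	A failure affects the whole DB	Each shard fails independently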

Real-World Usage Scenarios

When Partitioning Fits

@Service
public class AnalyticsService {

    // Analyze products of one type (benefits from partition pruning)
    public ProductAnalytics analyzeByType(String type) {
        List<Product> products = productRepository.findByType(type);

        return ProductAnalytics.builder()
            .totalCount(products.size())
            .averagePrice(calculateAveragePrice(products))
            .topSellingProducts(findTopSelling(products))
            .build();
    }
}

When Sharding Fits

@Service
public class ProductService {

    // shardingService, log, and mapRowToProduct are assumed to be defined
    // as in the earlier examples

    // Bulk update for one type: only that type's shard does the work, so
    // updates for different types can run in parallel on separate servers
    public void updatePricesForType(String type, BigDecimal multiplier) {
        JdbcTemplate shard = shardingService.getShardByType(type);

        String sql = "UPDATE product SET price = price * ? WHERE type = ?";
        int updatedRows = shard.update(sql, multiplier, type);

        log.info("Updated {} products in {} shard", updatedRows, type);
    }

    // High QPS: each shard serves its own requests independently
    @Async
    public CompletableFuture<List<Product>> findTopProductsByType(String type) {
        JdbcTemplate shard = shardingService.getShardByType(type);

        String sql = "SELECT * FROM product WHERE type = ? ORDER BY price DESC LIMIT 10";
        // The varargs overload replaces the deprecated query(sql, Object[], RowMapper)
        List<Product> products = shard.query(sql, this::mapRowToProduct, type);

        return CompletableFuture.completedFuture(products);
    }
}

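Conclusion

Partitioning and sharding are database scaling techniques that are useful in different situations.

Partitioning improves query performance with a relatively simple implementation, but it does not solve the underlying scalability problem, because all of the data still sits on one server. It suits medium-sized data sets that need better performance.

Sharding requires a more complex implementation, but it achieves true horizontal scaling and can handle large data volumes and high QPS. In exchange, you have to accept the complexity of cross-shard queries and transaction management.

In a real service, it is important to weigh the nature of the data, the traffic patterns, and the growth plans together when choosing a strategy. Sometimes combining the two approaches is also a good choice.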
